PR41688
This is a good idea in theory, but I'm somewhat concerned that this will backfire in practice because assumes often pessimize optimization. For example LLVM has a really bad offender where it inserts assumes for alignment annotations during inlining -- thankfully there's an option to disable it, because it's a major perf regression otherwise.
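For reference, the alignment case looks roughly like this in the IR: to preserve an align attribute across inlining, the inliner has to materialize a whole chain of instructions just to feed the assume (a sketch; value names are made up):

  %ptrint = ptrtoint i8* %ptr to i64    ; exists only to feed the assume
  %maskedptr = and i64 %ptrint, 15      ; align 16 -> mask of 15
  %maskcond = icmp eq i64 %maskedptr, 0
  call void @llvm.assume(i1 %maskcond)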
This feels like it should be less problematic though.
Pessimization of the final produced code, or of LLVM itself?
> This feels like it should be less problematic though.
Of the final produced code. I think it's mainly because assumes cause one-use checks to fail.
Oh, good point. It would make sense to not count @llvm.assume() during use-checking, but then this isn't just about the call void @llvm.assume() itself, but about all the instructions that would be dropped when the call is dropped during back-end lowering...
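A minimal IR sketch of both issues (function and value names hypothetical):

  define i32 @f(i32 %a, i32 %b) {
    %x = add i32 %a, %b           ; some intermediate value
    %c = icmp ne i32 %x, 0        ; exists only to feed the assume
    call void @llvm.assume(i1 %c)
    %d = sdiv i32 100, %x         ; without the assume chain, %d would be
    ret i32 %d                    ; the only use of %x; with it, %x has two
  }                               ; uses and hasOneUse()-guarded folds fail
  declare void @llvm.assume(i1)

When the assume is dropped during lowering, %c becomes dead as well, but by then the one-use-gated folds have already been missed.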
Since llvm.assume is not really a true use, the oneUse check fails? Then we should really fix it to not count such uses.
Any advice about next steps for this patch?
If the inserted llvm.assume is a big problem and "fixing" the oneUse match check is not the proper solution, then this patch is dead..
In general, automatically inserting llvm.assume must be done with extreme care, as it can pessimize optimizations. We try to do this only in cases where the user has specifically indicated that there is some information relevant to optimization that is worth preserving. In this case, it might be worth doing, because if (x) unreachable is the recommended "assume" pattern for use with GCC.
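Roughly, in IR terms (a sketch; function and block names made up), the frontend emits that pattern as a conditional branch to an unreachable block:

  define void @g(i1 %cond) {
  entry:
    br i1 %cond, label %cont, label %dead
  dead:
    unreachable
  cont:
    ret void
  }

and this patch would fold it to:

  define void @g(i1 %cond) {
  entry:
    call void @llvm.assume(i1 %cond)
    br label %cont
  cont:
    ret void
  }
  declare void @llvm.assume(i1)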
Possibly. But more fixes are needed as preconditions to enable this, as you mentioned here (fixing the oneUse check). Without them, we cannot push this further.
Just to be clear, I didn't mean to say that we shouldn't do this, just that it is not guaranteed to be a win in the end. If it's possible to do some end-to-end performance testing with this patch to make sure that it doesn't come with unexpected regressions, I see no reason not to move forward with this change. The problem with assumes is a long-standing one and I don't think it has any simple solution.
$ ../test-suite/utils/compare.py baseline.json
Tests: 1163
Metric: exec_time
Program baseline
LCALS/Subs...aw.test:BM_MAT_X_MAT_RAW/44217 868459.97
LCALS/Subs...test:BM_MAT_X_MAT_LAMBDA/44217 563058.65
ImageProce...t:BENCHMARK_GAUSSIAN_BLUR/1024 99965.67
ImageProce...HMARK_ANISTROPIC_DIFFUSION/256 77470.40
harris/har...est:BENCHMARK_HARRIS/2048/2048 52498.97
ImageProce...MARK_BICUBIC_INTERPOLATION/256 44359.86
LCALS/Subs...Raw.test:BM_MAT_X_MAT_RAW/5001 25499.45
ImageProce...st:BENCHMARK_GAUSSIAN_BLUR/512 24021.84
LCALS/Subs....test:BM_MAT_X_MAT_LAMBDA/5001 20025.00
ImageProce...HMARK_ANISTROPIC_DIFFUSION/128 18171.17
harris/har...est:BENCHMARK_HARRIS/1024/1024 15093.33
ImageProce...ate.test:BENCHMARK_DILATE/1024 12321.46
ImageProce...ARK_BILINEAR_INTERPOLATION/256 11589.59
ImageProce...MARK_BICUBIC_INTERPOLATION/128 10496.46
LCALS/Subs...Raw.test:BM_HYDRO_2D_RAW/44217 8448.96
baseline
count 1147.000000
mean 1727.619647
std 30862.929090
min 0.000000
25% 0.004000
50% 0.008800
75% 3.645378
max 868459.973000
$ ../test-suite/utils/compare.py patch.json
Tests: 1163
Metric: exec_time
Program patch
LCALS/Subs...test:BM_MAT_X_MAT_LAMBDA/44217 648771.10
LCALS/Subs...aw.test:BM_MAT_X_MAT_RAW/44217 549594.39
ImageProce...t:BENCHMARK_GAUSSIAN_BLUR/1024 112422.28
ImageProce...HMARK_ANISTROPIC_DIFFUSION/256 91073.89
harris/har...est:BENCHMARK_HARRIS/2048/2048 52924.20
ImageProce...MARK_BICUBIC_INTERPOLATION/256 50989.07
ImageProce...st:BENCHMARK_GAUSSIAN_BLUR/512 26065.95
LCALS/Subs...Raw.test:BM_MAT_X_MAT_RAW/5001 23588.53
ImageProce...HMARK_ANISTROPIC_DIFFUSION/128 21120.28
LCALS/Subs....test:BM_MAT_X_MAT_LAMBDA/5001 20901.71
ImageProce...ARK_BILINEAR_INTERPOLATION/256 15573.54
harris/har...est:BENCHMARK_HARRIS/1024/1024 14383.34
ImageProce...ate.test:BENCHMARK_DILATE/1024 12928.17
ImageProce...MARK_BICUBIC_INTERPOLATION/128 10768.69
LCALS/Subs...Raw.test:BM_HYDRO_2D_RAW/44217 9542.02
patch
count 1147.000000
mean 1571.500067
std 25580.275818
min 0.000000
25% 0.004000
50% 0.008900
75% 3.815259
max 648771.096000
You can use something like the command below to get a comparison between 2 sets of runs. Also, doing single runs only will probably result in quite noisy results.
./test-suite/utils/compare.py -m exec_time patch.json vs master.json
Not sure why this crashes for me:
/home/xbolva00/.local/lib/python2.7/site-packages/scipy/stats/stats.py:308: RuntimeWarning: divide by zero encountered in log
log_a = np.log(np.array(a, dtype=dtype))
/home/xbolva00/.local/lib/python2.7/site-packages/numpy/core/_methods.py:75: RuntimeWarning: invalid value encountered in reduce
ret = umr_sum(arr, axis, dtype, out, keepdims)
Anyway, I used --filter-short.. I ran and compared it 3x..
https://pastebin.com/kjFBv2ZP
Yes, a bit noisy, but I think there are some stable improvements..
These results look pretty noisy, so it's hard to say. Maybe someone who already has a good benchmarking setup (maybe @dmgreen?) could run some tests?
Independently of that, could you please precommit your new test, as well as the regenerated test checks? Especially for the LoopVectorize/if-pred-stores.ll test it's not clear what has actually changed.
Rebased on the updated test checks..
Some tests are failing:
Failing Tests (4):
  LLVM :: CodeGen/Hexagon/bit-visit-flowq.ll
  LLVM :: CodeGen/Hexagon/rdf-ignore-undef.ll
  LLVM :: CodeGen/Hexagon/reg-scavengebug.ll
  LLVM :: CodeGen/Hexagon/regalloc-block-overlap.ll
The root cause: the tests contain "br undef" and this is now optimized away. Not sure how to make progress here..
> The root cause: the tests contain "br undef" and this is now optimized away. Not sure how to make progress here..
This is a common problem with bugpoint-reduced tests. You'll have to update these tests (as well as if-pred-stores.ll), either by replacing unreachables with something else (say, a dummy return) or by replacing the undef branch with a proper condition (e.g. pass in a bool as an arg), while trying to preserve the original intention of the test.
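For example (a hypothetical reduced test), instead of

  define void @test() {
  entry:
    br i1 undef, label %bb1, label %bb2
  bb1:
    ret void
  bb2:
    ret void
  }

thread a real condition through as an argument so the CFG shape the test depends on survives:

  define void @test(i1 %cond) {
  entry:
    br i1 %cond, label %bb1, label %bb2
  bb1:
    ret void
  bb2:
    ret void
  }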
This LGTM. Please watch the LNT bots carefully and revert if we find performance regressions. A fear of performance regressions is specifically why we didn't do this when the assumption facility was first introduced.
Hello. Looks fine from the tests I ran. For code size too, which can be a good way to get a low-noise indication of how things change.
Thank you!
Committed.
Please also see https://bugs.llvm.org/show_bug.cgi?id=42019 if you have any ideas how to handle intrinsics in use-checking code.