This is an archive of the discontinued LLVM Phabricator instance.

Do not widen load for different variable in GVN.
ClosedPublic

Authored by danielcdh on Aug 31 2016, 11:48 AM.

Details

Summary

Widening load in GVN is too early because it will block other optimizations like PRE, LICM.

https://llvm.org/bugs/show_bug.cgi?id=29110

The SPECCPU2006 benchmark impact of this patch:

Reference: o2_nopatch
(1): o2_patched

Benchmark             Base:Reference   (1)

spec/2006/fp/C++/444.namd 25.2 -0.08%
spec/2006/fp/C++/447.dealII 45.92 +1.05%
spec/2006/fp/C++/450.soplex 41.7 -0.26%
spec/2006/fp/C++/453.povray 35.65 +1.68%
spec/2006/fp/C/433.milc 23.79 +0.42%
spec/2006/fp/C/470.lbm 41.88 -1.12%
spec/2006/fp/C/482.sphinx3 47.94 +1.67%
spec/2006/int/C++/471.omnetpp 22.46 -0.36%
spec/2006/int/C++/473.astar 21.19 +0.24%
spec/2006/int/C++/483.xalancbmk 36.09 -0.11%
spec/2006/int/C/400.perlbench 33.28 +1.35%
spec/2006/int/C/401.bzip2 22.76 -0.04%
spec/2006/int/C/403.gcc 32.36 +0.12%
spec/2006/int/C/429.mcf 41.04 -0.41%
spec/2006/int/C/445.gobmk 26.94 +0.04%
spec/2006/int/C/456.hmmer 24.5 -0.20%
spec/2006/int/C/458.sjeng 28 -0.46%
spec/2006/int/C/462.libquantum 55.25 +0.27%
spec/2006/int/C/464.h264ref 45.87 +0.72%

geometric mean +0.23%

For most benchmarks, it's a wash, but we do see stable improvements on some benchmarks, e.g. 447,453,482,400.

Diff Detail

Event Timeline

danielcdh updated this revision to Diff 69888.Aug 31 2016, 11:48 AM
danielcdh retitled this revision from to Do not widen load for different variable in GVN..
danielcdh updated this object.
danielcdh added reviewers: davidxl, hfinkel, dberlin.
dberlin edited edge metadata.Aug 31 2016, 12:40 PM

I believe you can simply disable memdep returning these as clobbers to get the same result.

IE remove lines 583-588, and then simplify the if to be:

if (isLoad && R == NoAlias)
continue;

I believe you can simply disable memdep returning these as clobbers to get the same result.

IE remove lines 583-588, and then simplify the if to be:

if (isLoad && R == NoAlias)
continue;

You mean completly remove isLoadLoadClobberIfExtendedToFullWidth in MemoryDependenceAnalysis.cpp?

Then the corresponding code in GVN becomes dead code, and we can completely remove AnalyzeLoadFromClobberingLoad?

The reason I kept the code was because one test in test/Transforms/GVN/PRE/atomic.ll can still benefit from it:

; CHECK-LABEL: @test13(
; atomic to non-atomic forwarding is legal
define i32 @test13(i32* %P1) {

%a = load atomic i32, i32* %P1 seq_cst, align 4
%b = load i32, i32* %P1
%res = sub i32 %a, %b
ret i32 %res
; CHECK: load atomic i32, i32* %P1
; CHECK: ret i32 0

}

But seems this case should not be common in real code, so maybe it's okay to remove the code as well as this test?

danielcdh updated this revision to Diff 69903.Aug 31 2016, 2:28 PM
danielcdh edited edge metadata.

cleanup

junbuml added a subscriber: junbuml.Sep 1 2016, 6:55 AM
gberry added a subscriber: gberry.Sep 6 2016, 8:21 AM
reames edited edge metadata.Sep 6 2016, 3:29 PM

The reason I kept the code was because one test in test/Transforms/GVN/PRE/atomic.ll can still benefit from it:

; CHECK-LABEL: @test13(
; atomic to non-atomic forwarding is legal
define i32 @test13(i32* %P1) {

%a = load atomic i32, i32* %P1 seq_cst, align 4
%b = load i32, i32* %P1
%res = sub i32 %a, %b
ret i32 %res
; CHECK: load atomic i32, i32* %P1
; CHECK: ret i32 0

}

But seems this case should not be common in real code, so maybe it's okay to remove the code as well as this test?

The context for this comment seems to be lost in the review thread. Which piece of code was this comment in reference to? This test shouldn't be effected by a widening change. If it is, that's more than a bit suspicious.

danielcdh updated this revision to Diff 70487.Sep 6 2016, 4:23 PM
danielcdh edited edge metadata.

update to fix atomic support

reames added a comment.Sep 6 2016, 4:35 PM

Please revert the GVN changes entirely and revisit the tests. I think several of them are stale.

lib/Transforms/Scalar/GVN.cpp
1260 ↗(On Diff #70487)

I think this comment may be misleading. Let me rewrite the example in modern IR.
%p = ...
load i32, i32* %p
%p2 = bitcast i32* to i8*
%p3 = getelementptr i8, i8* %p2, 1
load i8, i8* %p

The two loads overlap without needing to widen either load.

1272 ↗(On Diff #70487)

Leave this as it was.

danielcdh updated this revision to Diff 70489.Sep 6 2016, 4:38 PM

completely revert gvn

reames added a comment.Sep 6 2016, 4:38 PM

Once we land this, remind me to come back and prune some dead cases from the GVN code. I'd like to do that separately as the code changes will be subtle.

In the latest patch, I reverted all changes in GVN, only updates the memdep to disable load-widening.

reames added a comment.Sep 6 2016, 4:43 PM

Do you have any performance data here? I'm a bit concerned about the interactions with inlining.

test/Transforms/GVN/PRE/rle.ll
626

Add a comment that says we explicitly choose NOT to widen here and are testing to make sure we don't.

test/Transforms/GVN/big-endian.ll
10–12

Please use CHECK: forms so that we tell what's being done here. Same in other cases.

danielcdh updated this revision to Diff 70491.Sep 6 2016, 4:46 PM

update comment

I have the perf data from the original patch (in the patch description), I don't expect it to change with the updated patch. But I can rerun test and update the result soon.

Updated perf test result:

spec/2006/fp/C++/444.namd 25.2 +0.12%
spec/2006/fp/C++/447.dealII 45.92 +0.72%
spec/2006/fp/C++/450.soplex 41.7 +0.19%
spec/2006/fp/C++/453.povray 35.65 +0.90%
spec/2006/fp/C/433.milc 23.79 +0.08%
spec/2006/fp/C/470.lbm 41.88 +0.14%
spec/2006/fp/C/482.sphinx3 47.94 +0.48%
spec/2006/int/C++/471.omnetpp 22.46 +0.18%
spec/2006/int/C++/473.astar 21.19 -0.61%
spec/2006/int/C++/483.xalancbmk 36.09 -1.66%
spec/2006/int/C/400.perlbench 33.28 +2.28%
spec/2006/int/C/401.bzip2 22.76 +0.26%
spec/2006/int/C/403.gcc 32.36 +0.22%
spec/2006/int/C/429.mcf 41.04 +0.24%
spec/2006/int/C/445.gobmk 26.94 -0.11%
spec/2006/int/C/456.hmmer 24.5 -0.16%
spec/2006/int/C/458.sjeng 28 -0.54%
spec/2006/int/C/462.libquantum 55.25 -0.07%
spec/2006/int/C/464.h264ref 45.87 +1.00%

geometric mean +0.19%

The xalan regression is secondary effect. I compared all top functions, they are all bit-identical except for different function start addresses. It shouldn't be the blocker of this patch.

Does anyone else have additional comments on this change?

If not, could someone help approve this patch?

Thanks!
Dehao

reames accepted this revision.Sep 9 2016, 11:21 AM
reames edited edge metadata.

Danny LGTM'd this in the email thread, but it's not showing up in the phabricator review for some reason. Also, LGTM by me as well.

This revision is now accepted and ready to land.Sep 9 2016, 11:21 AM
danielcdh closed this revision.Sep 9 2016, 11:50 AM