Page MenuHomePhabricator

[TargetPassConfig] Add CanonicalizeFreezeInLoops before LSR
ClosedPublic

Authored by aqjune on Apr 5 2020, 11:26 PM.

Details

Diff Detail

Event Timeline

aqjune created this revision.Apr 5 2020, 11:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 5 2020, 11:26 PM

This deserves a PhaseOrdering test (going from the most underoptimized but minimized IR, -O3)

aqjune updated this revision to Diff 255364.Apr 6 2020, 8:50 AM

Add a test with llc -O3 showing that freeze does not affect codegen

aqjune updated this revision to Diff 255370.Apr 6 2020, 9:02 AM

Update the test so it diverges without CanonicalizeFreezeInLoops

Harbormaster completed remote builds in B51967: Diff 255370.
nikic added a subscriber: nikic.May 7 2020, 1:05 PM
nikic added inline comments.
llvm/test/Transforms/CanonicalizeFreezeInLoops/aarch64.ll
4

Probably needs REQUIRES?

aqjune marked an inline comment as done.May 7 2020, 1:43 PM
aqjune added inline comments.
llvm/test/Transforms/CanonicalizeFreezeInLoops/aarch64.ll
4

I'm not familiar with REQUIRES: directive, would this work?

; REQUIRES: arm-registered-target
aqjune marked an inline comment as not done.May 7 2020, 1:44 PM
xbolva00 added inline comments.
llvm/test/Transforms/CanonicalizeFreezeInLoops/aarch64.ll
1

Precommit testcase?

nikic added inline comments.May 7 2020, 2:13 PM
llvm/test/Transforms/CanonicalizeFreezeInLoops/aarch64.ll
4

I believe for this triple it should be aarch64-registered-target.

aqjune updated this revision to Diff 262805.May 7 2020, 7:36 PM

Add REQUIRES

aqjune marked an inline comment as done.May 7 2020, 7:38 PM
aqjune added inline comments.
llvm/test/Transforms/CanonicalizeFreezeInLoops/aarch64.ll
1

Once D77523 is accepted, I'll precommit this test.

aqjune updated this revision to Diff 265404.May 20 2020, 6:24 PM

Precommit test

As the main patch is landed, probably it is okay to land this too?
If there is a non-trivial regression, I can temporarily revert this and work on it.

ping

I evaluated the performance impact on linking time by compiling CTMark with -O3 + LTO enabled, and the result was as follows:

(unit: sec.)
			bed7884	+D77524	speedup
7zip			47.34	47.29	0.12%
bullet			9.83	9.62	2.28%
clamscan		25.42	25.44	-0.06%
SPASS			19.91	19.93	-0.12%
consumer-typeset	19.12	19.24	-0.64%
kimwitu++		15.19	15.06	0.88%
lencod			39.91	40.03	-0.30%
mafft			8.77	8.77	0.05%
sqlite3			25.07	25.33	-1.02%
tramp3d-v4		36.23	36.21	0.06%

The geometric mean of speedup is -0.13%, which seems to be within an error, considering that compilation time (without linking time) showed 0.1% average speedup which should be 0% in theory because LTO is enabled.

As the validity of this patch has been discussed at D77523, I think it is okay to add the pass before LSR.

nikic accepted this revision.May 27 2020, 1:05 PM

LGTM

This revision is now accepted and ready to land.May 27 2020, 1:05 PM
This revision was automatically updated to reflect the committed changes.

@aqjune thanks again for working on this pass. I can confirm that the regression in mcf we originally reported on D76483 is now resolved!

Glad to hear that! :)

https://bugs.llvm.org/show_bug.cgi?id=50440 was filed as a response to this. Since it doesn't look like we'll be able to fix forward quickly and clang 12.0.1 is about to release, I've filed https://bugs.llvm.org/show_bug.cgi?id=50573 to revert this from the main and release/12.x branches.