Page MenuHomePhabricator

[hot-cold-split] fix static analysis of cold regions
ClosedPublic

Authored by sebpop on Oct 4 2018, 1:35 PM.

Details

Summary

Make the code of blockEndsInUnreachable to match the function
blockEndsInUnreachable in CodeGen/BranchFolding.cpp. I also have
added a note to make sure the code of this function will not be
modified unless the back-end version is also modified.

An early return before outlining has been added to avoid
outlining the full function body when the first block in the
function is marked cold.

The static analysis of cold code has been amended to avoid
marking the whole function as cold by back-propagation
because the back-propagation would mark blocks with return
statements as cold.

The patch adds debug statements to help discover these problems.

Diff Detail

Event Timeline

sebpop created this revision.Oct 4 2018, 1:35 PM
sebpop added a comment.Oct 4 2018, 1:38 PM

Before this patch we have a 10% regression in sqlite with hot-cold-split pass.
With this patch I now see a 3% speedup on sqlite vs. no hot-cold-split pass.

sebpop updated this revision to Diff 168476.Oct 5 2018, 8:20 AM

Added an early return for outlining if function entry is cold.
Added a check for invoke calls: invoke should not be marked as cold by back propagation.

There are only a few cold regions that are now extracted with the static analysis of hot/cold blocks on a relatively large benchmark.
Through those extracted blocks there are some diamonds that do not contain much code.
We may want to avoid outlining a candidate cold region if the number of instructions in the region is too low.

Hi @sebpop, I have a few comments inline regarding the function determining side effect status.

llvm/lib/Transforms/IPO/HotColdSplitting.cpp
154

Why is an Indirect branch considered a return? The instruction is required to branch to an address of a label within the current function.

I am also wondering about the EH TIs: catchswitch, resume, catchret, cleanupret. Should we just pessimize all EH code with:

if (I->isEHPad())
  return true;
158

Should we consider adding a per-instruction side-effect check?

if (I.mayHaveSideEffects())
  return true;
tejohnson added inline comments.Oct 15 2018, 11:36 AM
llvm/lib/Transforms/IPO/HotColdSplitting.cpp
105

Suggest copying in the explanatory header comments from there.

395

Suggest moving this check outside this loop. I.e. just check if Begin is cold and return early if so.

I will post a patch to fix the comments from @tejohnson.

llvm/lib/Transforms/IPO/HotColdSplitting.cpp
154

Why is an Indirect branch considered a return?

This function is similar to blockEndsInUnreachable which is used to detect cold code for the forward propagation. The documentation of blockEndsInUnreachable says:

Also consider a block that ends in an indirect branch to be a return block, since many targets use plain indirect branches to return.

I am not sure why the back propagation should stop on an instruction which is I->isEHPad.

158

I would say no because

bool mayHaveSideEffects() const { return mayWriteToMemory() || mayThrow(); }

Code that stores to memory or that throws can be cold.

sebpop updated this revision to Diff 169742.Oct 15 2018, 12:50 PM

Fix two comments from @tejohnson.

Are both fixes necessary to fix the issue (the one for back propagation and the one to bail out if the entry block is cold), or is either one sufficient? The patch description only mentions the former.

Are both fixes necessary to fix the issue (the one for back propagation and the one to bail out if the entry block is cold), or is either one sufficient? The patch description only mentions the former.

I am not sure whether the entry block will get marked as cold after the two fixes to the forward and back propagation.
We could try to transform the early return into an assert that the entry block of a function is never marked cold.
I will test this change.

Thanks for your careful review of the patch!

Are both fixes necessary to fix the issue (the one for back propagation and the one to bail out if the entry block is cold), or is either one sufficient? The patch description only mentions the former.

We need both parts of the patch to fix the issue.

I am not sure whether the entry block will get marked as cold after the two fixes to the forward and back propagation.
We could try to transform the early return into an assert that the entry block of a function is never marked cold.
I will test this change.

Static analysis would mark the first block in the next function as cold:

define void @fun() {
entry:

unreachable

}

When the entry block is marked cold,
we need an early return before outlining the full function.

With this patch I now see a 3% speedup on sqlite vs. no hot-cold-split pass.

Awesome! Thanks for improving this!

sebpop edited the summary of this revision. (Show Details)Oct 15 2018, 2:26 PM

With this patch I now see a 3% speedup on sqlite vs. no hot-cold-split pass.

Awesome! Thanks for improving this!

I think the 3% uplift in performance is within the noise level.
At least with this patch I do not see the 10% regression in performance with hot/cold partitioning
which most likely was due to doubling the cost of function calls when the whole function was outlined.

This LGTM, but please wait for @brzycki

This revision was not accepted when it landed; it landed in state Needs Review.Oct 15 2018, 2:45 PM
Closed by commit rL344558: [hot-cold-split] fix static analysis of cold regions (authored by spop, committed by ). · Explain Why
This revision was automatically updated to reflect the committed changes.

With this patch I now see a 3% speedup on sqlite vs. no hot-cold-split pass.

Awesome! Thanks for improving this!

I think the 3% uplift in performance is within the noise level.
At least with this patch I do not see the 10% regression in performance with hot/cold partitioning
which most likely was due to doubling the cost of function calls when the whole function was outlined.

I think hot-cold splitting with function merging can help reduce code-size as well.

orivej added a subscriber: orivej.Oct 15 2018, 3:36 PM
orivej added inline comments.
llvm/trunk/lib/Transforms/IPO/HotColdSplitting.cpp
153 ↗(On Diff #169757)

This line fails to compile with g++ 5.4.0 with:

error: invalid conversion from 'const llvm::Instruction*' to 'const llvm::TerminatorInst*' [-fpermissive]

Did you mean const Instruction *I = BB.getTerminator();.

NoQ added a subscriber: NoQ.Oct 15 2018, 4:54 PM

There seem to be more buildbot issues, eg. http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental/54221/console

: 'RUN: at line 1';   /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/clang-build/bin/opt -hotcoldsplit -S < /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm/test/Transforms/HotColdSplit/split-out-dbg-val-of-arg.ll | /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/clang-build/bin/FileCheck /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm/test/Transforms/HotColdSplit/split-out-dbg-val-of-arg.ll
--
Exit Code: 1

Command Output (stderr):
--
/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm/test/Transforms/HotColdSplit/split-out-dbg-val-of-arg.ll:3:16: error: CHECK-LABEL: expected string not found in input
; CHECK-LABEL: define {{.*}}@foo_if.end
               ^
<stdin>:1:1: note: scanning from here
; ModuleID = '<stdin>'
^
<stdin>:4:1: note: possible intended match here
define void @foo(i32 %arg1) !dbg !6 {
^
NoQ added a comment.Oct 15 2018, 7:07 PM

This helped, thanks!

Nice work @sebpop. Here's a delayed "LGTM". :)