This is an archive of the discontinued LLVM Phabricator instance.

Rewrite Aggressive Dead Code Elimination
AbandonedPublic

Authored by david2050 on Apr 4 2016, 9:33 AM.

Details

Summary

We noticed that a source of performance difference between llvm and
gcc was in dead code elimination. This diff replaces ADCE with a new
implementation which will remove control flow and, under option, also
remove may-be-infinite loops which are dead. The patch current has
"ADCE_new.h" and "ADCE_new.cpp" to clarify that this is essentially a
full rewrite of the deleted "ADCE.h" and "ADCE.cpp", When ready to
land, I plan to change these names back to the originals.

The new algorithm uses the iterated dominance frontier to
incrementally identify control dependence sources for live operations
to determine when branch operations are live. It adds a new fix-up
phase to correct update phi-nodeswhen control flow changes.

Below are compile time and performance impacts based on Facebook
internal code sets.

Baseline ADCE compared to new ADCE compile time, -O3 removing loops
Comparing compile times for ~400 source files from folly code base

"baseline" is total time in seconds for all files
"net_difference" is increase due to changes
"percent" is increase expressed aa persent of baseline
"samples" is number of distinct tests, these vary in some cases
as runs with high variance in timing are excluded from summary

baseline net_difference percent samples .
28,591.41 165.91 0.58 413 .

Baseline ADCE compared with new ADCE, also removing loops

baseline net_difference percent samples .
28,591.02 215.35 0.75 411 .

Here are the impacts on runtimes for three different internal sets of
benchmarks, again compiled -O3

baseline, percent, samples as above

Without removing loops:

baseline                           total  samples  .
105,846,951,721,178.10  1.000    85      .
  28,317,122,801,690.20  0.998  2
         364,771,688,928.1  0.997    64      .

With removing loops: (adding -mllvm -adce-remove-loops)

baseline                           total  samples  .
105,845,678,704,074.98  1.000   78      .
  29,901,291,181,631.10  0.988  280      .
    332,140,718,056.76     0.982   61      .

Diff Detail

Event Timeline

david2050 updated this revision to Diff 52568.Apr 4 2016, 9:33 AM
david2050 retitled this revision from to Add Aggressive Control Dead Code Elimination.
david2050 updated this object.

(added llvm-commits as subscriber)

freik added a subscriber: freik.Apr 4 2016, 9:54 AM
twoh added a subscriber: twoh.Apr 4 2016, 10:11 AM
david2050 updated this revision to Diff 54104.Apr 18 2016, 1:12 PM

Modified to use IDFCalculator (itself modified to work on the reverse graph)

david2050 updated this object.Apr 18 2016, 1:16 PM
david2050 added reviewers: dberlin, hfinkel.
david2050 added subscribers: MatzeB, bogner.
david2050 updated this revision to Diff 54193.Apr 19 2016, 8:16 AM
david2050 updated this object.

Update to new IDFCalculator

david2050 updated this revision to Diff 54871.Apr 25 2016, 9:41 AM

Update to recent ADCE changes

david2050 updated this revision to Diff 55007.Apr 26 2016, 8:23 AM

Cleanp in response to review comments

zzheng added a subscriber: zzheng.Apr 26 2016, 11:11 AM

Just some drive-by nitpicks on format: acdec.[h|cpp] files' banner and 80 columns. Please comply with coding standard.

include/llvm/Transforms/Scalar.h
96

Exceeds 80 columns.

include/llvm/Transforms/Scalar/ACDCE.h
2 ↗(On Diff #55007)

//===- ACDCE.h - Aggressive CONTROL dead code elimination--------===//

lib/Transforms/Scalar/ACDCE.cpp
2 ↗(On Diff #55007)

Please fix the banner.

Just in case you didn't know David, we have a tool called clang-format that automatically format your code for the LLVM coding standard. LLVM comes also with a git-clang-format script that if you put in your path enables a command git clang-format <range> that format only the lines that are touched by the diff.
(this is an optional tool, you don't *have to* use it, I'm just pointing it for your convenience)

david2050 marked 3 inline comments as done.Apr 26 2016, 1:07 PM

Thanks.

david2050 updated this revision to Diff 55087.Apr 26 2016, 1:51 PM

Fix formatting errors

david2050 updated this revision to Diff 55088.Apr 26 2016, 1:53 PM

and the formatting one in ACDCE.h

david2050 updated this revision to Diff 55239.Apr 27 2016, 8:33 AM

Update two recently added test and fix a bug incorporating changes to
ADCE.cpp

david2050 updated this revision to Diff 55715.Apr 30 2016, 5:47 AM

Bug fix in handling of "hard" phi nodes

majnemer added inline comments.
lib/Transforms/IPO/PassManagerBuilder.cpp
302–303

Please format this appropriately.

lib/Transforms/Scalar/ACDCE.cpp
508–513 ↗(On Diff #55715)

Why are EHPads always considered live?

609 ↗(On Diff #55715)

Please use // comments.

617 ↗(On Diff #55715)

Please start sentences with a capital letter.

617–623 ↗(On Diff #55715)

It would be helpful if you gave a short example of this in IR (as a comment).

620–621 ↗(On Diff #55715)

The formatting got weird here.

645–647 ↗(On Diff #55715)

Some strange indentation made its way here.

731 ↗(On Diff #55715)

Please obey the naming convention.

818 ↗(On Diff #55715)

Please start each sentence with a capital letter and end it with a period.

849 ↗(On Diff #55715)

Ditto.

877–878 ↗(On Diff #55715)

I wouldn't bother with an IRBuilder for a single instruction, BranchInst::Create(Target, PredTerm) would do.

david2050 marked 10 inline comments as done.Jun 6 2016, 1:27 PM

Thanks for the review

lib/Transforms/IPO/PassManagerBuilder.cpp
302–303

I assume this means removing the braces.

lib/Transforms/Scalar/ACDCE.cpp
508–513 ↗(On Diff #55715)

I cribbed the condition from the existing ADCE pass. I assumed that it would be impossible to remove the corresponding invoke so simplest to just keep the operation live.

david2050 updated this revision to Diff 59776.Jun 6 2016, 1:28 PM
david2050 marked an inline comment as done.

Address comments

david2050 updated this revision to Diff 59782.Jun 6 2016, 2:11 PM
  • Merge branch 'master' into acdce-with-idf-for-review
junbuml added a subscriber: junbuml.Jun 7 2016, 6:28 AM

From Chandler by email:

t would be good to at least send a fresh mail to the list with an updated rationale and such.

One thing that I'm having trouble even making out from the patch description in Phab is exactly what numbers you have collected. The formatting makes it really hard to understand what change this actually introduces.

As one high-level comment, why would we want this *and* ADCE to exist? I feel like we don't need 4 or 5 different dead code elimination passes and would benefit from this actually just being a patch to one of the existing ones.

Another high-level comment: if we must have both ADCE and this pass, I find the name "ACDCE" or "Aggressive Control Dead Code Elimination" very confusing. The order of the words seems all wrong. If we need to distinguish between eliminating dead control flows independent of eliminating dead instructions within basic blocks, I would name them "Aggressive Dead Control Flow Elimination" and "Aggressive Dead Code Elimination". And I would consider stripping the "Aggressive" prefix because I doubt that really makes sense these days.

Thanks Chandler

Changes in response shortly.

david2050 updated this revision to Diff 60180.Jun 9 2016, 8:19 AM

Replace the baseline ADCE with this implementation modified to
not build the post-dominator tree and in that case retain all
control flow. This preserves prior behavior for -O1 and -O2 and avoids
the extra cost needed to remove control flow.

david2050 updated this revision to Diff 62366.Jun 30 2016, 8:23 AM

Following Chandler's feedback, modified to simply replace the existing
ADCE rather than having two implementations. Reused baseline names.

Fixed some bugs in handling debug scopes/operations.

david2050 retitled this revision from Add Aggressive Control Dead Code Elimination to Rewrite Aggressive Dead Code Elimination.Jun 30 2016, 8:26 AM
david2050 updated this object.
david2050 updated this object.Jun 30 2016, 8:33 AM

Following Chandler's feedback, modified to simply replace the existing
ADCE rather than having two implementations. Reused baseline names.

Fixed some bugs in handling debug scopes/operations.

Unfortunately, the way this patch is structured, it just shows you deleting the entire ADCE patch, and then adding a new one with no way to tell what the history is. This makes reviewing it really challenging.

I think the best way to go about this is to post small, incremental cleanups and refactorings to the existing ADCE pass until there is a clean design where you can submit a patch that adds control flow deletion (or whatever other logic you want) to the *existing* pass behind a flag.

Does that make sense?

I am sympathetic of course to the view of incremental changes but I don't really think it is effective in this case because so much functionality has to come online in one go. There is a kernel this is common (propagating liveness as a sparse backwards flow problem, lines 488-497) but that is quite small. The next logical step is to defer deleting branches until you know they control something live. This needs to be done in batches to use IDFCalculator efficiently and adds an outer loop. However, once you start having *any* dead control flow you need to deal with the complexities of phi nodes and the attendant problems with changing the control flow graph. There is not really any useful intermediate step,

david2050 abandoned this revision.Apr 12 2018, 11:10 AM