This is an archive of the discontinued LLVM Phabricator instance.

[CGP] Build the DominatorTree lazily
ClosedPublic

Authored by tejohnson on Mar 22 2019, 7:09 AM.

Details

Summary

In r355512 CGP was changed to build the DominatorTree only once per
function traversal, to avoid repeatedly building it each time it was
accessed. This solved one compile time issue but introduced another. In
the second case, we now were building the DT unnecessarily many times
when we performed many function traversals (i.e. more than once per
function when running CGP because of changes made each time).

Change to saving the DT in the CodeGenPrepare object, and building it
lazily when needed. It is reset whenever we need to rebuild it.

The case that exposed the issue there are 617 functions, and we walk
them (i.e. execute the "while (MadeChange)" loop in runOnFunction) a
total of 12083 times (so previously we were building the DT 12083
times). With this patch we only build the DT 844 times (average of 1.37
times per function). We dropped the total time to compile this file from
538.11s without this patch to 339.63s with it.

There is still an issue as CGP is taking much longer than all other
passes even with this patch, and before a recent compiler release cut at
r355392 the total time to this compile was only 97 sec with a huge
reduction in CGP time. I suspect that one of the other recent changes to
CGP led to iterating each function many more times on average, but I
need to do some more investigation.

Diff Detail

Event Timeline

tejohnson created this revision.Mar 22 2019, 7:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2019, 7:09 AM
Herald added a subscriber: jdoerfert. · View Herald Transcript

Thanks for working on this!

spatel added inline comments.Mar 22 2019, 4:13 PM
lib/CodeGen/CodeGenPrepare.cpp
300–301

We should add the reasoning for this choice in the code comment.
Something like:
Creating a dominator may be expensive, so we only do this lazily and update when required.

390–395

I don't like the idea of passing a shadowed name version of the class member as a param (and we already found a bug related to that). Can we make these member functions as a preliminary NFC commit?

tejohnson marked 4 inline comments as done.Mar 24 2019, 8:27 AM
tejohnson added inline comments.
lib/CodeGen/CodeGenPrepare.cpp
300–301

Modified comment.

390–395

Good catch on the TLI and TL member variables! I fixed that part and committed the restructuring in r356857.

tejohnson updated this revision to Diff 192034.Mar 24 2019, 8:28 AM
tejohnson marked 2 inline comments as done.

Address comments, sync in NFC change r356857.

spatel accepted this revision.Mar 25 2019, 8:54 AM

LGTM

This revision is now accepted and ready to land.Mar 25 2019, 8:54 AM
This revision was automatically updated to reflect the committed changes.
fhahn added a subscriber: fhahn.Mar 27 2019, 11:02 AM
fhahn added inline comments.
llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
345 ↗(On Diff #192171)

Would it be worth trying the fetch the DT from the analysis cache for the first time?

Also I am not sure about how common CFG changes are here? If we do not make too many changes, I suppose we could save some more time by lazily updating the DT using the update interface.

spatel added inline comments.Mar 27 2019, 1:21 PM
llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
345 ↗(On Diff #192171)

Partly addressed in a newer patch from today:
D59889

tejohnson marked an inline comment as done.Mar 29 2019, 8:55 AM
tejohnson added inline comments.
llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
345 ↗(On Diff #192171)

Would it be worth trying the fetch the DT from the analysis cache for the first time?
Also I am not sure about how common CFG changes are here? If we do not make too many changes, I suppose we could save some more time by lazily updating the DT using the update interface.

Agree these are better long term fixes for this code (which as @spatel and I discussed recently on D59139 has gotten really unwieldy and could do with an overhaul). My familiarity with this pass is very limited though, I have just been doing some fixes to recover from recent compile-time issues.