This is an archive of the discontinued LLVM Phabricator instance.


Differential D62981

[DomTreeUpdater] Add all insert before all delete updates to reduce compile time.
Closed, Public

Authored by asbirlea on Jun 6 2019, 1:53 PM.


Details

Reviewers

kuhar
NutshellySima
mstorsjo

Commits

rGeaea538d18c1: [DomTreeUpdater] Add all insert before all delete updates to reduce compile…
rL362839: [DomTreeUpdater] Add all insert before all delete updates to reduce compile…

Summary

The cleanup in D62751 introduced a compile-time regression due to the way DT updates are performed.
Add all insert edges, then all delete edges, to the DTU update list to match the previous compile time.
Compile time on the test provided by @mstorsjo, before vs. after this patch on my machine:
113.046s vs 35.649s
Repro: clang -target x86_64-w64-mingw32 -c -O3 glew-preproc.c; on https://martin.st/temp/glew-preproc.c.

Diff Detail

Repository

rL LLVM

Build Status

Buildable 33071
Build 33070: arc lint + arc unit

Event Timeline

asbirlea created this revision. Jun 6 2019, 1:53 PM

Herald added a project: Restricted Project. Jun 6 2019, 1:53 PM

Herald added a subscriber: jlebar.

Harbormaster completed remote builds in B33027: Diff 203442. Jun 6 2019, 1:53 PM

I don't know this area so I can't comment on it from that perspective, but it does indeed speed up my case, to even faster than it was before the regression. On my system, it originally took 75 s to compile, 220 s after the regression, and now 62 s with this patch. So looking good in that aspect at least!

Thanks for the patch! I believe that the modification here fulfills the precondition of calling the mutation APIs of the DomTreeUpdater.
BTW, after seeing this patch, I recalled an unmerged patch, D54730, which has the same motivation and similar modifications.
I think "sorting updates so that insertions always happen before deletions" needs to be analyzed case by case, as there isn't enough evidence that the updating process will always be faster that way.
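As an illustration of what "sorting updates so that insertions always happen before deletions" could look like in general, here is a minimal standalone sketch. The `Update` struct and `insertsBeforeDeletes` function are hypothetical stand-ins, not LLVM types or APIs, and the patch itself does not sort: it builds the list in the desired order directly. Whether such a reordering helps in any given case is exactly the open question raised above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for DominatorTree::UpdateType (not the LLVM type).
enum class UpdateKind { Insert, Delete };

struct Update {
  UpdateKind Kind;
  std::string From;
  std::string To;
};

// Move every Insert ahead of every Delete. std::stable_partition keeps the
// relative order inside each group, so only the Insert/Delete interleaving
// changes.
void insertsBeforeDeletes(std::vector<Update> &Updates) {
  auto IsInsert = [](const Update &U) { return U.Kind == UpdateKind::Insert; };
  std::stable_partition(Updates.begin(), Updates.end(), IsInsert);
  assert(std::is_partitioned(Updates.begin(), Updates.end(), IsInsert));
}
```

Note that a stable partition keeps the deletes in their original relative order, which is not identical to the order chosen in this patch (where the PredBB -> BB delete comes last).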

lib/Transforms/Utils/BasicBlockUtils.cpp
Lines 215–221: I would like a comment here explaining that the order of updates matters for performance. :)

asbirlea mentioned this in D54730: [DomTree] Fix order of domtree updates in MergeBlockIntoPredecessor. Jun 7 2019, 9:33 AM

Add detailed comment.

Harbormaster completed remote builds in B33071: Diff 203588. Jun 7 2019, 11:24 AM

Thanks for the changes. LGTM.

This revision is now accepted and ready to land. Jun 7 2019, 11:58 AM

Closed by commit rL362839: [DomTreeUpdater] Add all insert before all delete updates to reduce compile… (authored by asbirlea). Jun 7 2019, 1:43 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path: lib/Transforms/Utils/BasicBlockUtils.cpp
Size: 14 lines

Diff 203588

lib/Transforms/Utils/BasicBlockUtils.cpp

[206 lines not shown; enclosing context: if (isa<PHINode>(BB->front())) {]

     FoldSingleEntryPHINodes(BB, MemDep);
   }

   // DTU update: Collect all the edges that exit BB.
   // These dominator edges will be redirected from Pred.
   std::vector<DominatorTree::UpdateType> Updates;
   if (DTU) {
     Updates.reserve(1 + (2 * succ_size(BB)));
-    Updates.push_back({DominatorTree::Delete, PredBB, BB});
-    for (auto I = succ_begin(BB), E = succ_end(BB); I != E; ++I) {
-      Updates.push_back({DominatorTree::Delete, BB, *I});
+    // Add insert edges first. Experimentally, for the particular case of two
+    // blocks that can be merged, with a single successor and single predecessor
+    // respectively, it is beneficial to have all insert updates first. Deleting
+    // edges first may lead to unreachable blocks, followed by inserting edges
+    // making the blocks reachable again. Such DT updates lead to high compile
+    // times. We add inserts before deletes here to reduce compile time.
+    for (auto I = succ_begin(BB), E = succ_end(BB); I != E; ++I)
       // This successor of BB may already have PredBB as a predecessor.
       if (llvm::find(successors(PredBB), *I) == succ_end(PredBB))
         Updates.push_back({DominatorTree::Insert, PredBB, *I});
-    }
+    for (auto I = succ_begin(BB), E = succ_end(BB); I != E; ++I)
+      Updates.push_back({DominatorTree::Delete, BB, *I});
+    Updates.push_back({DominatorTree::Delete, PredBB, BB});
   }

   if (MSSAU)
     MSSAU->moveAllAfterMergeBlocks(BB, PredBB, &*(BB->begin()));

   // Delete the unconditional branch from the predecessor...
   PredBB->getInstList().pop_back();

[693 lines not shown]
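The comment added in this diff says that deleting edges first may leave blocks unreachable until the matching inserts arrive. The following toy model, which is only a sketch and not how DomTreeUpdater is implemented, makes that concrete. It applies a list of edge updates one at a time to a small CFG and counts how many intermediate states leave a watched block unreachable from the entry. All names here (`Graph`, `Update`, `reachableFrom`, `countUnreachableSteps`) are hypothetical and use only the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using Node = std::string;
using Graph = std::map<Node, std::set<Node>>; // block -> successor blocks

// Hypothetical stand-in for DominatorTree::UpdateType (not the LLVM type).
enum class UpdateKind { Insert, Delete };
struct Update {
  UpdateKind Kind;
  Node From;
  Node To;
};

// Collect every node reachable from Entry with a depth-first walk.
std::set<Node> reachableFrom(const Graph &G, const Node &Entry) {
  std::set<Node> Seen{Entry};
  std::vector<Node> Worklist{Entry};
  while (!Worklist.empty()) {
    Node N = Worklist.back();
    Worklist.pop_back();
    auto It = G.find(N);
    if (It == G.end())
      continue;
    for (const Node &Succ : It->second)
      if (Seen.insert(Succ).second)
        Worklist.push_back(Succ);
  }
  return Seen;
}

// Apply the updates one by one to a copy of G and count the intermediate
// states in which Watched cannot be reached from Entry.
int countUnreachableSteps(Graph G, const Node &Entry,
                          const std::vector<Update> &Updates,
                          const Node &Watched) {
  int Steps = 0;
  for (const Update &U : Updates) {
    if (U.Kind == UpdateKind::Insert) {
      G[U.From].insert(U.To);
    } else {
      std::size_t Erased = G[U.From].erase(U.To);
      assert(Erased == 1 && "deleting an edge that is not in the graph");
      (void)Erased;
    }
    if (reachableFrom(G, Entry).count(Watched) == 0)
      ++Steps;
  }
  return Steps;
}
```

For the chain Entry -> Pred -> BB -> Succ, with BB being merged into Pred and Succ as the watched block, the old order (delete Pred -> BB, delete BB -> Succ, insert Pred -> Succ) leaves Succ unreachable after two of the three steps. The new order (insert Pred -> Succ, delete BB -> Succ, delete Pred -> BB) never does. The model only shows the reachability effect; it says nothing about the actual cost of each update inside the dominator tree code.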