This is an archive of the discontinued LLVM Phabricator instance.

[Dominators] Fix and optimize edge insertion of depth-based search
ClosedPublic

Authored by MaskRay on Feb 18 2019, 6:39 AM.

Details

Summary

After (x,y) is inserted, depth-based search finds all affected v that satisfies:

depth(nca(x,y))+1 < depth(v) && there exists a path P from y to v where every w on P satisfies depth(v) <= depth(w)

This reduces to a widest path problem (maximizing the weight of the
minimum vertex in the path) which can be solved by a modified version of
Dijkstra with a bucket queue (Depth-based search as mentioned in the paper).

The algorithm visits vertices in decreasing order of bucket number.
However, the current code misused priority_queue to extract them in
increasing order. I cannot think of a failing scenario but it surely may
process vertices more than once due to the local usage of Processed.

This patch fixes this bug and simplifies/optimizes the code a bit. Also
add more comments.

Diff Detail

Event Timeline

MaskRay created this revision.Feb 18 2019, 6:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 18 2019, 6:39 AM
kuhar added a comment.Feb 18 2019, 8:15 AM

@MaskRay, awesome, thanks for looking into this.

Does it affect the number of visited nodes and number of calls to VisitInsertion?

depth(nca(x,y))+1 < depth(v) && there exists a path P from y to v where every w on P satisfies depth(v) <= depth(w)

This reduces to a widest path problem (maximizing the weight of the
minimum vertex in the path) which can be solved by a modified version of
BFS with a bucket queue (Depth-based search as mentioned in the paper).

It would be great to also mention that in the implementation -- from what I remember, the paper doesn't mention the widest path problem.

Additionally, this change should be tested extensively -- domtree bugs tend to often be very difficult to discover.

If you are having fun playing with the incremental updater, I think there's much more to squeeze out of deletions -- the implementation cut corners in many cases, while deletions are known to be a bottleneck on some bitcode in JumpThreading and other loop transforms.

include/llvm/Support/GenericDomTreeConstruction.h
754

Is it possible to give Stack a less generic name, or comment on what it corresponds to in the paper?

824–825

nit: while (true)

826

I'd prefer not to reassign(scalar) function arguments, mostly for readability and debugability.

MaskRay updated this revision to Diff 187293.Feb 18 2019, 8:24 PM

Refactor more and add comments

MaskRay marked 3 inline comments as done.Feb 18 2019, 8:25 PM
MaskRay added inline comments.
include/llvm/Support/GenericDomTreeConstruction.h
826

I think this function actually harms readability. I deleted it.

MaskRay marked 3 inline comments as done.Feb 18 2019, 8:33 PM

I refactored the code a bit more. I found the split function actually made me hard to follow the logic so I just deleted it.

The current patches passes ninja check-all. Also checked llvm-as.bc (generated by gllvm) and sqlite3.bc

sqlite-autoconf-3270100
clang -Xclang -disable-llvm-passes -c -emit-llvm sqlite3.c # get sqlite.bc

Compared opt -passes='default<O3>,function(print<domtree>)' -disable-output < sqlite3.bc => no difference.

cmake -GNinja -H. -BGLLVM -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gclang -DCMAKE_CXX_COMPILER=gclang++ -DCMAKE_C_FLAGS='-Xclang -disable-llvm-passes' -DCMAKE_CXX_FLAGS='-Xclang -disable-llvm-passes'

Compared`opt -passes='default<O3>,function(print<domtree>) -disable-output < ~/llvm/GLLVM/bin/llvm-as.bc` (39MiB) => no difference

include/llvm/Support/GenericDomTreeConstruction.h
754

Changed it to UnaffectedOnCurrentLevel

MaskRay updated this revision to Diff 187298.Feb 18 2019, 8:48 PM
MaskRay marked an inline comment as done.
MaskRay edited the summary of this revision. (Show Details)

Change the comments a bit

kuhar accepted this revision.Feb 18 2019, 8:56 PM

Thanks for the fixes and explanations. Seems good to me now.

This revision is now accepted and ready to land.Feb 18 2019, 8:56 PM
MaskRay updated this revision to Diff 187300.Feb 18 2019, 9:07 PM
MaskRay edited the summary of this revision. (Show Details)

Add more comments: clarify if (SuccLevel <= NCDLevel + 1 || !II.Visited.insert(SuccTN).second)

This revision was automatically updated to reflect the committed changes.