Page MenuHomePhabricator

[ADCE] Add code to remove dead branches
ClosedPublic

Authored by david2050 on Sep 26 2016, 8:02 AM.

Details

Summary

This is last in of a series of patches to evolve ADCE.cpp to support
removing of unnecessary control flow.

This patch adds the code to update the control and data flow graphs
to remove the dead control flow.

Also update unit tests to test the capability to remove dead,
may-be-infinite loop which is enabled by the switch
-adce-remove-loops.

Previous patches:

D23824 [ADCE] Add handling of PHI nodes when removing control flow
D23559 [ADCE] Add control dependence computation
D23225 [ADCE] Modify data structures to support removing control flow
D23065 [ADCE] Refactor anticipating new functionality (NFC)
D23102 [ADCE] Refactoring for new functionality (NFC)

Diff Detail

Repository
rL LLVM

Event Timeline

david2050 updated this revision to Diff 72486.Sep 26 2016, 8:02 AM
david2050 retitled this revision from to [ADCE] Add code to remove dead branches.
david2050 updated this object.
david2050 added subscribers: twoh, freik.
dberlin added inline comments.Sep 28 2016, 8:22 AM
lib/Transforms/Scalar/ADCE.cpp
217 ↗(On Diff #72486)

I don't think this does what you think.
The order here is not actually guaranteed to be in any particular order, AFAIK.

It happens most things produce blocks in lexical order, but this is not guaranteed (again, AFAIK)

Thus, it's entirely possible you see a successor before a predecessor, which you will incorrectly mark a a back edge.
Please just use a standard DFS approach that is guaranteed correct.

228 ↗(On Diff #72486)

This isn't correct. What if you have two backedges going out of a single block?

235 ↗(On Diff #72486)

We should just fix IDF calculator in the reverse case?

In any case, this is just a reverse DFS (IE DFS over Inverse<BasicBlock>) from all blocks with no successors (LLVM does not hvae a single return block, you can't really say "the return of the function").

Please just do that.

248 ↗(On Diff #72486)

This has the same bug as above :)

610 ↗(On Diff #72486)

Is the above really necessary?

The standard way to do this, AFAIK, is to mark the useful block set while doing marking, and then just walking up the PDT to find the nearest block marked useful for each dead branch.
Replace all uses of the bb with that block using RAUW.
This should update phi nodes, since I believe the blocks in phi nodes are still considered uses.

david2050 added inline comments.Oct 1 2016, 8:36 AM
lib/Transforms/Scalar/ADCE.cpp
217 ↗(On Diff #72486)

This seems super simple and conservative in the sense it will catch all loops. The variant of this code before this review cycle started used po_iterator but there no measurable benefit but quite a bit more code.

228 ↗(On Diff #72486)

Still can only mark the terminator of that block live once.

235 ↗(On Diff #72486)

I think you are proposing to indentify all blocks where there is a path to function return. Given that we we can then infer those blocks which do not reach a return. But this seems to be exactly the information already available from the post-dominator tree.

248 ↗(On Diff #72486)

I don't think so: marking the terminator of the source block not the successor block.

david2050 added inline comments.Oct 1 2016, 8:40 AM
lib/Transforms/Scalar/ADCE.cpp
610 ↗(On Diff #72486)

I don't understand this approach. Given a live branch (x,y) where y is dead and z is the live post-dominator of y, we can't just replace uses of 'y' because the 'y' may not be directly referenced in phi nodes in 'z' which only will old references to its predecessors. Also, 'y' could have multiple branches into it which are dead.

dberlin added inline comments.Oct 1 2016, 2:05 PM
lib/Transforms/Scalar/ADCE.cpp
217 ↗(On Diff #72486)

So, here are two of identical code, laid out in the IR differently, where we will do different things, and should not.

A
jump C
B
jump D
C
jump B

A
jump C
C
jump B
B
jump D

In the first, the visitation order will be A, B, C.
You will mark A as seen.
No successors seen, nothing marked.
You will now mark B as visited
No successors seen, nothing marked
You will now mark C as visited
You will see that you have marked B as visited
You will incorrectly think this is a back edge.

In the second, the visitation order, will be A, C, B.
You will mark A as seen.
No successors seen, nothing marked.
You will mark C as seen
No successors seen, nothing marked.
You will mark B as seen
No successors seen, nothing marked

228 ↗(On Diff #72486)

yes, i read it wrong, sorry!

248 ↗(On Diff #72486)

yes, i read it wrong, sorry!

david2050 updated this revision to Diff 73820.Oct 6 2016, 10:32 AM

rebase and update to new df_iterator interface

david2050 marked 6 inline comments as done.Oct 6 2016, 10:33 AM
mehdi_amini edited edge metadata.Nov 2 2016, 10:52 AM

I had started reviewing after you previous ping last week but couldn't finish because of the conference preparation, so just some piece of wip feedback inline.

lib/Transforms/Scalar/ADCE.cpp
232 ↗(On Diff #73820)

OK, I was wondering how the "completed" stuff worked when you introduced it, I see now :)

243 ↗(On Diff #73820)

I think the boilerplate can be reduced:

for (auto *BB : make_range(Iterator::begin(&F.getEntryBlock(), State),
                                               Iterator::end(&F.getEntryBlock(), State)) {
   ....

The repetition of &F.getEntryBlock(), State isn't great, upgrading iterator_range<df_iterator<T>> depth_first(const T& G) and df_begin and df_end to accept the extra State parameter should allow to write:

for (auto *BB : depth_first(&F.getEntryBlock(), State)) {
   ....
david2050 updated this revision to Diff 76904.Nov 3 2016, 5:40 PM
david2050 marked an inline comment as done.
david2050 edited edge metadata.

Use depth_first_ext() template

ping

lib/Transforms/Scalar/ADCE.cpp
243 ↗(On Diff #73820)

Thanks, the existing depth_first_ext applies.

Sorry for the delay, I keep getting interrupted in the middle of getting this code into my head, and paging out has a real cost when getting back to it. Especially as I don't really know the algorithm, I have to infer what you're doing from the implementation (which isn't a bad way to judge of the clarity of the code).

Anyway, a few inline comments.

lib/Transforms/Scalar/ADCE.cpp
240 ↗(On Diff #76904)

Remove this empty line?

496 ↗(On Diff #76904)

This is quite complex to follow: the NewBranches is some state that is kept on the side and updated by makeUnconditional (transitively called by updateDeadRegions), and used here as a cleanup. It is never cleared BTW.
Maybe declaring NewBranches locally to this function and passing it by as an output parameter to updateDeadRegions()?

571 ↗(On Diff #76904)

s/unprocsesed/unprocessed/

580 ↗(On Diff #76904)

Parameter should be SmallPtrSetImpl<BasicBlock *> *VisitedBlocks. The template with the size is only used where you need to actually create one.

(Also I'd use a reference instead of pointer)

611 ↗(On Diff #76904)

auto * for pointers
(Many other occurrences)

621 ↗(On Diff #76904)

How do we get this invariant that a region has a single live PDOM?
(If you can point me to the relevant code)

628 ↗(On Diff #76904)

Could be compressed to a single else:

else  {
  assert(!LivePDOM || LivePDOM == SuccBB);
  LivePDOM = SuccBB;
}

Also we usually try to add a message with assertions when possible.

665 ↗(On Diff #76904)

To make sure I understand correctly: if the terminator was a conditional branch with each target BB in different dead region, then this terminator shouldn't be marked as dead?

667 ↗(On Diff #76904)

/me scrolling but don't see the example...

610 ↗(On Diff #72486)

I'm not sure I understood what @dberlin referred to, but as I read it it should be possible to just process DeadBlocks without computing the dead region, and only manipulate Phi in the live PDOM if the live PDOM is a direct successor of the current block. I think this should get the same result in term of Phi, but does not seems less costly than computing the region.
Also I'm not sure if you can rewrite the branches from the live predecessors into the dead region to jump directly to the live PDOM without computing the region (but with the invariant you are already relying on to get a dead region with a unique live PDOM it may be possible).

david2050 marked 10 inline comments as done.Nov 17 2016, 12:14 PM
david2050 added inline comments.
lib/Transforms/Scalar/ADCE.cpp
621 ↗(On Diff #76904)

I will add a comment:

// (A block with a dead branch which reaches a live block that
//  is not a post-dominator would be a control dependence source
//  and hence its branch must be live)
665 ↗(On Diff #76904)

This is a live block which has whose terminator is dead the implication is that all paths from this branch must reach LivePDOM (directly or through blocks with dead branches). We replace this branch with an unconditional branch.

mehdi_amini added inline comments.Nov 17 2016, 1:54 PM
lib/Transforms/Scalar/ADCE.cpp
665 ↗(On Diff #76904)

My question was about another invariant at this stage:

A
| \
B C
| |
D E

If B and C are dead, but A, D and E are live.
A's terminator branch to either B or C, which are both dead. However B and C aren't part of the same dead region.
You're operating here under the assumption that A's terminator has be marked as live somehow, right?

david2050 updated this revision to Diff 78599.Nov 18 2016, 4:36 PM
david2050 marked 2 inline comments as done.

reduce scope of NewBranches, fix formatting

david2050 updated this revision to Diff 78600.Nov 18 2016, 4:36 PM

reduce scope of NewBranches, fix formatting

david2050 added inline comments.Nov 21 2016, 2:32 PM
lib/Transforms/Scalar/ADCE.cpp
665 ↗(On Diff #76904)

Yes, if "D" is live then, as expressed, "A" is a control-dependence source for "D" (and for all non-A nodes listed) and as such will be marked live. "A" will be in the iterated dominance frontier of the "D" in the reverse control flow graph. If the terminator is not live then "D" and "E" must be the same (and in the code labeled LivePDOM).

Regarding the question of simpler update to the control flow, I reviewed two references from @dberlin:

(1) https://www.clear.rice.edu/comp512/Lectures/10Dead-Clean-SCCP.pdf

In this work, the PHI nodes contain only references to named definitions and do not include a reference to the predecessor blocks associated with reaching paths from those definitions.

(2) https://github.com/apple/swift/blob/master/lib/SILOptimizer/Transforms/DeadCodeElimination.cpp

This swift variant makes no explicit reference to Phi nodes and does neither updates them nor manages explicit predecessors as far as I can tell.

Does anyone have another pointer to compiler which have SSA-based dead code elimination including control flow which might guide to a similar approach the code here?
Thanks!

david2050 updated this revision to Diff 80516.Dec 6 2016, 5:26 PM

Use reverse CFG numbering to alter control flow

david2050 marked 6 inline comments as done.Dec 6 2016, 5:28 PM
dberlin added inline comments.Dec 6 2016, 10:41 PM
lib/Transforms/Scalar/ADCE.cpp
601 ↗(On Diff #80516)

I think you should just be able to output the PO iterator on Inverse<F> into an array, no?

I would just add

template <class T>
po_iterator<T> inverse_po_begin(const T &G) { return po_iterator<Inverse<T>>::begin(G); }
template <class T>
po_iterator<T> inverse_po_end  (const T &G) { return po_iterator<Inverse<T>>::end(G); }

template <class T> iterator_range<po_iterator<T>> inverse_post_order(const T &G) {
  return make_range(inverse_po_begin(G), inverse_po_end(G));
}

To PostOrderIterator.h
Then you can just use a nice range loop to fill in the numbers wherever you want them.

david2050 updated this revision to Diff 80609.Dec 7 2016, 9:07 AM

use existing inverse_post_order

david2050 added inline comments.Dec 7 2016, 9:08 AM
lib/Transforms/Scalar/ADCE.cpp
601 ↗(On Diff #80516)

Thanks for reminding me. These macros already are defined in in the header. We still need the loop to find all nodes without successors right, because there could be multiple returns/

david2050 updated this revision to Diff 80611.Dec 7 2016, 9:10 AM

remove unused declarations

Skimming through it it LGTM as well. Thanks!

mehdi_amini accepted this revision.Dec 13 2016, 8:42 AM
mehdi_amini edited edge metadata.
This revision is now accepted and ready to land.Dec 13 2016, 8:42 AM
This revision was automatically updated to reflect the committed changes.