Page MenuHomePhabricator

[DTU] Refine the document of mutation APIs [NFC] (PR40528)
ClosedPublic

Authored by NutshellySima on Feb 7 2019, 2:42 AM.

Details

Summary

It was pointed out in Bug 40528 that it is not clear whether insert/deleteEdge can be used to perform multiple updates and a comment in D57316 reveals that the difference between several ways to update the DominatorTree is confusing.

This patch tries to address issues above.

Diff Detail

Repository
rL LLVM

Event Timeline

NutshellySima created this revision.Feb 7 2019, 2:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 7 2019, 2:42 AM
NutshellySima set the repository for this revision to rG LLVM Github Monorepo.
kuhar added a comment.Feb 7 2019, 7:57 AM

Nice, thanks for working on this!

As a general direction, I think we should try to minimize the amount of documentation about DT updates -- the number of different ways and all their preconditions are confusing.

I'd like to first go forward with this patch, after minor corrections, and then mark insertEdge/deleteEdge as deprecated, migrate their uses to applyUpdates, and later remove it entirely. The, we can note put a note in GenericDomTree saying that for the llvm trees, DTU should be used instead.

llvm/include/llvm/Analysis/DomTreeUpdater.h
93 ↗(On Diff #185734)

I'd add some indentation.

94 ↗(On Diff #185734)

Instead of referring to the DT incremental updater, I'd say that it performs updates immediately.

101 ↗(On Diff #185734)

Instead of listing this as another method, I'd mention that even though GenericDomTree provides these update primitives, it is recommended to use DTU with llvm::DomTree and llvm::PostdomTree. I think the less text here the better.

106 ↗(On Diff #185734)

I'd completely omit the mention of the 'manual' update API -- there are very few legitimate reasons to use it now

127 ↗(On Diff #185734)

I think this is fine with the eager strategy.

141 ↗(On Diff #185734)

update that the DomTreeUpdater or the DominatorTree already acknowledged.

update that has already been applied.

153 ↗(On Diff #185734)

like above

207 ↗(On Diff #185734)

under the

Nice, thanks for working on this!

Agreed, thank you!

I'd like to first go forward with this patch, after minor corrections, and then mark insertEdge/deleteEdge as deprecated, migrate their uses to applyUpdates, and later remove it entirely. Then we can note put a note in GenericDomTree saying that for the llvm trees, DTU should be used instead.

+1
The insertEdge() and deleteEdge() direct calls require so much manual fix-up at the call-sites I'd really prefer to see them go away. It also provides a cleaner interface to DTU as a whole, opening up opportunities to alter the underlying class without changing call semantics.

mkazantsev added inline comments.Feb 10 2019, 9:54 PM
llvm/include/llvm/Analysis/DomTreeUpdater.h
137 ↗(On Diff #185734)

That looks *highly* fishy. Between which two points we should have exactly one update?

For example, if we use eager strategy and we need to delete 10 edges and insert 10 other edges, is that guaranteed to work if we consecutively make 1 CFG update followed by 1 DT update, and do so for every edge? Can we interpret it as making exactly one update 20 times over?

Is that exactly one update between two flushes for lazy strategy?

Hello @mkazantsev.

That looks *highly* fishy. Between which two points [of insertEdge()] we should have exactly one update?

If you are using the Eager strategy these updates will be immediately applied to the DT and PDT internal to the DTU class. As the comment states there are internal checks of the CFG to determine if the insertion is valid. Because DT, PDT, and CFG are three distinct and separate structures we have to treat the CFG as the canonical shape of the IR. In Lazy strategy mode the update is simply queued until the next flush event.

For example, if we use eager strategy and we need to delete 10 edges and insert 10 other edges, is that guaranteed to work if we consecutively make 1 CFG update followed by 1 DT update, and do so for every edge? Can we interpret it as making exactly one update 20 times over?

Yes, if and only if the state of the CFG is considered valid after each single deletion and insertion event (see number 3 below). For most cases it's recommended to use the applyUpdates() method in this case for several reasons:

  1. There may be updates that cancel each other out {Insert, A, B} ... {Delete, A, B}. It is much more efficient to detect these before altering the DT and PDT as a no-op.
  2. It's often the case at the call-site the code iterates over dozens of edges and needs to alter several of them in the CFG. Because the DTU class requires a stable CFG state to check valid insertions and deletions it's often better to send the updates to DTU after all the changes are done.
  3. There are cases when it's impossible to make the CFG correct for a single edge change. There are often cases where 2-4 edges have to be altered before the CFG is considered valid. In this case you'll actually cause a DTU assert if you attempt to perform insertEdge() for every edge because it will fail when attempting to check the insertion in the CFG.
  4. Single calls to insertEdge() and deleteEdge() incur function call overhead and multiple checks that are aggregated in the case of a single call to applyUpdates().

There are valid reasons to use insertEdge() and deleteEdge(). The primary reason is the caller needs to only make exactly one change to DTU. The call to applyUpdates() has a bit of extra overhead to sanitize the update list and only makes sense if there are 2 or more updates.

Is that exactly one update between two flushes for lazy strategy?

I'm not sure how to answer this question. The idea of exactly one update for Lazy strategy is much more difficult to determine. The update may be a no-op. It may also be enqueued only to be removed by a later update before a flush event as a new no-op pair. The update might be invalid but DTU was called through one of the *Relaxed() methods.

I will say that if you do the following:

DTU->flush();
// Alter CFG to add a new edge A -> B
DTU->insertEdge(A, B);
DTU->flush();

Will perform exactly one update at the DTU level. This may cause two sub-updates if both DT and PDT are not nullPtr. Each of these sub-updates may incur dozens of edge changes in their respective DominatorTree graphs.

NutshellySima marked 2 inline comments as done.Feb 11 2019, 9:14 PM
NutshellySima added inline comments.
llvm/include/llvm/Analysis/DomTreeUpdater.h
127 ↗(On Diff #185734)

Considering updating DT/PDT under Eager mode is an idempotent action, I think it is applicable to delay the update validity scanning under Lazy mode to the time submitting updates to DT/PDT to avoid some fishy problems involved in JT before and also make applyUpdates() under Lazy mode also idempotent without using "point-based updates" which can have potential performance issues.

137 ↗(On Diff #185734)

When using DTU Eager UpdateStrategy or using DT/PDT directly, updating a DominatorTree doesn't follow associative law of conjunction because the internal updating logic of DomTree will do a graph search on the corresponding CFG each time you call the mutation APIs.
For example,

// Don't make any change to CFG
// Considering Eager DomTree updates is idempotent
DT.insertEdge(Foo, Bar); // Assert here if there is no Foo->Bar in the original CFG
DT.deleteEdge(Foo, Bar); // Assert here if there is Foo->Bar in the original CFG

Considering insertEdge/deleteEdge can bring confusion to devs, I'll consider remove those 2 APIs but

// Don't make any change to CFG
// Considering Eager DomTree updates is idempotent
DT.applyUpdates({Insert,Foo, Bar}); // Assert here if there is no Foo->Bar in the original CFG
DT.applyUpdates({Delete,Foo, Bar}); // Assert here if there is Foo->Bar in the original CFG

is also not OK overall.

mkazantsev added inline comments.Feb 12 2019, 2:59 AM
llvm/include/llvm/Analysis/DomTreeUpdater.h
137 ↗(On Diff #185734)

I mean the following scenario:

  1. Delete A->B in control flow
  2. DTU.deleteEdge(A, B)
  3. Delete B->C in control flow
  4. DTU.deleteEdge(B, C) ...

I can have two different interpretations of that:
a) It is not supposed to work under any strategy because I have more than 1 deletion.
b) It is only supposed to work under Eager strategy because at every point I make exactly one CFG update followed by DT update, and it is not supposed to work under Lazy strategy because I have more than 1 deletion.

From this comment, I cannot figure out which interpretation is correct (if any).

NutshellySima marked an inline comment as done.Feb 12 2019, 6:29 AM
NutshellySima added inline comments.
llvm/include/llvm/Analysis/DomTreeUpdater.h
137 ↗(On Diff #185734)

DTU can work in the scenario you mentioned.
The precondition you call any mutation API (insertEdge/deleteEdge/applyUpdates) is that the current CFG is the same as (the CFG which DTU is in sync+updates submitted).
So,

// CFG state a), DTU is now in sync with CFG state a)
1. Delete A->B in control flow // CFG state b)
2. DTU.deleteEdge(A, B) // `CFG state b)` matches `CFG state a)-Edge(A, B)`, so it's legal. DTU is now in sync with CFG state b)
3. Delete B->C in control flow // CFG state c)
4. DTU.deleteEdge(B, C) // `CFG state c)` matches `CFG state b)-Edge(B, C)`, so it's legal. DTU is now in sync with CFG state c).

The above scenario works under both Eager and Lazy modes.

You can also submit updates under *only* Lazy mode like:

  1. Delete A->B in control flow
  2. Delete B->C in control flow
  3. DTU.deleteEdge(A, B)
  4. DTU.deleteEdge(B, C)

But I don't think it's valuable to document this difference between Eager and Lazy mode because it will only cause confusion to others. I believe the aim of the document of DTU is to make a uniform rule to use both Eager and Lazy modes to encourage us to write code that is fully compatible under both modes, i.e. hiding the difference between them.

mkazantsev added inline comments.Feb 12 2019, 9:41 PM
llvm/include/llvm/Analysis/DomTreeUpdater.h
137 ↗(On Diff #185734)

This precondition you've mentioned should be met at *different* points under Lazy and Eager strategy. We DO NOT need this precondition after each insert/deleteEdge under Lazy strategy, we only need it when we flush. We DO need this precondition after each insert/deleteEdge under Eager strategy.

If I understand you correctly, there are actually two rules:

  1. Under the Eager strategy, flush is done after every insert/deleteEdge. Under the Lazy strategy, it should be called explicitly.
  2. By the moment when we make flush, the current CFG is the same as (the CFG which DTU is in sync+updates submitted).

This difference is important, and lack of it in documentation was the motivation for PR40528. I think it should be documented.

NutshellySima updated this revision to Diff 186630.EditedFeb 13 2019, 5:33 AM
NutshellySima set the repository for this revision to rG LLVM Github Monorepo.
  1. Add documents on flush APIs and clarify the precondition when flush APIs are called.
  2. Add information on recalculate().
  3. Add information on the invalid updates removal feature. (applyUpates())
  4. Remove documents on internal details which API users don't need to know about.
  5. Mark insertEdge*()/deleteEdge*() as deprecated.
  6. Address other comments.
NutshellySima marked 10 inline comments as done.Feb 13 2019, 5:47 AM
NutshellySima added inline comments.
llvm/include/llvm/Analysis/DomTreeUpdater.h
127 ↗(On Diff #185734)

I did an experiment on submitting updates already applied to DT, but I observed 1 assertion and 2 crashes running check-llvm. I guess it's because of this piece of code. And it is also helpful to make DTU deduplicate reliably (see D58170). So I keep this note.

kuhar accepted this revision.Feb 13 2019, 8:45 AM

Looks better now, found just a couple of nits.

llvm/include/llvm/Analysis/DomTreeUpdater.h
88 ↗(On Diff #186630)

nit: s/APIs/API

91 ↗(On Diff #186630)

DominatorTree and PostDominatorTree?

95 ↗(On Diff #186630)

s/call flush APIS/call the flush function

102 ↗(On Diff #186630)

s/Though/Although

122 ↗(On Diff #186630)

s/i.e./i.e.,

138 ↗(On Diff #186630)

Please mark the method with: LLVM_ATTRIBUTE_DEPRECATED
This can happen in a separate patch.

148 ↗(On Diff #186630)

I thought that we still need the relaxed version in order to make JumpThreading code happy? Should it be deprecated?

157 ↗(On Diff #186630)

Use LLVM_ATTRIBUTE_DEPRECATED

167 ↗(On Diff #186630)

like above

This revision is now accepted and ready to land.Feb 13 2019, 8:45 AM
NutshellySima added inline comments.Feb 13 2019, 9:04 AM
llvm/include/llvm/Analysis/DomTreeUpdater.h
148 ↗(On Diff #186630)

I think it is not needed to preserve the relaxed version because JT also submits invalid updates when calling applyUpdates(). And I believe D58170 can solve most of the problems in JT by making use of the extra information inferred from the fact that JT only submits updates that didn't happen but never submits updates that are already applied.

brzycki added inline comments.Feb 13 2019, 9:35 AM
llvm/include/llvm/Analysis/DomTreeUpdater.h
148 ↗(On Diff #186630)

I hope you are correct @NutshellySima . I never wanted to add the *relaxed() methods but there are situations in JumpThreading where we run for a very long time before calling flush(). For some F I was unable to detect all invalid cases.

It's probably worth testing to do so, although it's been quite a long time for me to recall exactly which testcases caused problems.

mkazantsev accepted this revision.Feb 13 2019, 9:17 PM

LGTM. Thakns a lot for doing this!

brzycki accepted this revision.Feb 15 2019, 7:55 AM

Address comments.

NutshellySima marked 10 inline comments as done.Feb 19 2019, 9:15 PM
This revision was automatically updated to reflect the committed changes.