insert/deleteEdge methods in DTU can make updates incorrectly in some cases
(see https://bugs.llvm.org/show_bug.cgi?id=40528), and it is recommended to
use applyUpdates methods instead when it is needed to make a mass update in CFG.
Details
Diff Detail
Event Timeline
I think the underlying problem is a bug of DTU, but am not 100% sure. In the given test, eager strategy does the following:
- Removes dead edge bb4->bb18;
- Tries to remove edge bb18->bb6, but the actual deletion does not happen because bb18 does not have a DT node;
- As result, after these two operations the IDOM of bb6 is bb3, while it should be bb29 after we have removed edge bb18->bb6.
To me it seems that at some point we screwed up making the probper updates for bb6. I think we should've done it when removed bb4->bb18. However the comments in DTU about that are fuzzy. They say that:
/// Notify all available trees on an edge deletion. If both DT and PDT are /// nullptrs, this function discards the update. Under either Strategy, /// self-dominance update will be removed. The Eager Strategy applies /// the update immediately while the Lazy Strategy queues the update. /// It is recommended to only use this method when you have exactly one /// deletion (and no insertions). It is recommended to use applyUpdates() in /// all other cases. This function has to be called *after* making the update /// on the actual CFG. An internal functions checks if the edge doesn't exist /// in the CFG in DEBUG mode. void deleteEdge(BasicBlock *From, BasicBlock *To);
I do not know what this recommendation is about. Does anybody know whether DTU should be able to work correctly with eager strategy?
Maybe @kuhar knows what should be the correct behavior of eager DTU updates here? I think it's a bug in DTU, but the comments are uncertain enough to give me doubts about that. :)
lib/Transforms/Scalar/LoopSimplifyCFG.cpp | ||
---|---|---|
326–327 | I have not looked at this in detail, but I've had a similar issue at one point in SimpleLoopUnswitch with DomTree. AFAICT, this applies for DTU as well. SmallVector<DominatorTree::UpdateType, 3> DTUpdates; DTUpdates.push_back({DT.Insert, NewPreheader, L.getHeader()}); DTUpdates.push_back({DT.Insert, Preheader, NewPreheader}); DTUpdates.push_back({DT.Delete, Preheader, L.getHeader()}); DTU.applyUpdates(DTUpdates); Perhaps the behavior is different with the Eager strategy, but I'm fairly confident using applyUpdates is correct for DT. |
lib/Transforms/Scalar/LoopSimplifyCFG.cpp | ||
---|---|---|
326–327 | With lazy strategy, flush() is doing exactly this thing, calling applyUpdates inside. |
It is indeed a bug in DTU, I was able to construct a unit test to reproduce it. https://bugs.llvm.org/show_bug.cgi?id=40528
You should use DTU.applyUpdates after a transformation that affects multiple CFG edges to inform DomTree about all of the things that changed since the last time it was updated. Eager strategy means that at each update point the updates are performed immediately, while the lazy update strategy is free to schedule them at any 'update' (synchronization) point between the first one and flush (possibly multiple time).
Perhaps we should improve the the documentation. @mkazantsev: do you have some suggestions?
@NutshellySima What was the reason we introduced DTU.insertEdge and DTU.deleteEdge? While they are essential primitives in the incremental updater, couldn't we get rid of it entirely? My memory is blurry.
When I read the documentation, I interpreted
It is recommended to only use this method when you have exactly one deletion (and no insertions). It is recommended to use applyUpdates() in all other cases.
as a profitability reasoning, not correctness reasoning. If we intentionally do not guarantee equivalence of Eager strategy and Lazy strategy (which is the same as applyUpdates), we should clearly state that "It is only legal to use this method when you have exactly one deletion...".
So if you think that the observed behavior is not a bug, I propose making this change. Otherwise, the bug should be fixed.
I've run some extensive Fuzz testing on my side for three 72 hours, it seems this one along with other fixes on review cover all bugs we had.
Under Eager update strategy, they act as an alias of DTU.applyUpdates. And under Lazy update strategy, they'll assert when insertEdge() is inserting a nonexistent edge / deleteEdge() is deleting an edge that still exists to detect possible misuses.
Anyway, it is OK to get rid of insertEdge/deleteEdge entirely if we remove the check I mentioned. (I didn't implement a similar check in DTU.applyUpdates under Lazy strategy because it is still an open debate whether trying "inserting a nonexistent edge/deleting an edge that still exists" is a good approach.)
You should replace insert/deleteEdge with applyUpdates -- I'd argue that even though it works with how the lazy strategy is implemented now, it is semantically incorrect because the DomTree updates you schedule don't match the CFG at the point of updates.
Then, it would be interesting to see whether:
- Eager strategy works (it should)
- How fast is eager vs lazy
Max, why don't you use DTU with .applyUpdates? Are you facing some difficulties, or found documentation confusing somewhere else?
I don't have much time recently, but I can definitely find enough to fix some issues people have with updating dominators. If you think something is not obvious or too difficult, there are probably multiple people who faced the same problems but never bother to share their experience.
The goal of DTU is take away as much low-level burden when updating dominators as possible. Ideally, nobody should have to use the raw updater in DomTree anymore, and the DTU api should enable us to cheaply update PostDominators without modifying the update logic.
That's why I'm opposed to using raw updates, unless absolutely necessary.
Actually I am just confused by the number of different ways to do the same thing, all having different bugs inside. If using DTU.applyUpdates is the right day now, I'm ok with that.
I have not looked at this in detail, but I've had a similar issue at one point in SimpleLoopUnswitch with DomTree. AFAICT, this applies for DTU as well.
Calling deleteEdge and insertEdge one after the other will not do the updates properly.
The correct way to do it is:
Perhaps the behavior is different with the Eager strategy, but I'm fairly confident using applyUpdates is correct for DT.