This extends DeferredDominance to hold an optional PostDomTree, and apply the same set of updated to both dom trees. Loop Simplify is then switched over to use DeferredDominance, and applies the correct set of edge updates to preserve the two trees at once.
Details
- Reviewers
- None
Diff Detail
Event Timeline
I will try to rebase this on top of d40146 once it goes in. Providing I can get the applyUpdates to work.
I assume this depends on D41299?
lib/Target/AMDGPU/SIAnnotateControlFlow.cpp | ||
---|---|---|
375 ↗ | (On Diff #127155) | Why do we try to preserve the DominatorTree here, but not the PostDominatorTree? Here and in other files below |
lib/Transforms/Utils/BasicBlockUtils.cpp | ||
324 | Maybe add a comment here like for updating the dominator tree? | |
522 | Update comment to include PostDominator | |
606 | Update comment to include postdominator | |
lib/Transforms/Utils/LoopSimplify.cpp | ||
470 | Update postdominator comment? PDT is optional here but DT is not. I assume that has to do with some instances passing nullptr as above. Is there any reason we cannot make PDT required?`I am not saying we need to do all this in a single commit, but we should keep that in mind. | |
845 | unrelated whitespace change. |
Hello. Thanks for taking a look.
After starting at D40146 I think a lot of this will end up changing. I will have to look into what that will end up looking like once it is in and I have some time to update things.
I've tried the answer the questions that are relevant to that. I have another patch for the same thing in LICM, which is mostly trivial on top of this.
lib/Target/AMDGPU/SIAnnotateControlFlow.cpp | ||
---|---|---|
375 ↗ | (On Diff #127155) | I think the only place we currently use PostDominatorTrees in the default pipeline is in ADCE. Theres also places like GVNHoist, but it's off at the moment. We will likely have to add post doms to DSE, which has LICM (and hence LoopSimplify) in between it and ADCE. So if we can at least get these two passes to preserve the tree, we will get the post dom tree for free (ish) :) So most places currently don't know about PostDomTrees. DomTrees and much more common. In the long run it would be good to get them to preserve, where it makes sense, but it can be a bit fiddly to make sure its correct. This interface will likely change to not take a PostDomTree, taking an updater object instead. |
lib/Transforms/Utils/BasicBlockUtils.cpp | ||
522 | Will do. I'll update them when I have a new version. | |
lib/Transforms/Utils/LoopSimplify.cpp | ||
470 | This may turn into a DeferredDominance object holding a DT and PDT that applies the same updates to each. The PDT will be rarer and probably still optional, but hopefully a lot of the interface will simplify. |
I don't know what people will think of the interfaces for SplitBlockPredecessors and simplifyLoop. Let me know if theres a better way I should be doing that. I also removed the old splitBlock from Dominator trees, replacing it with a series of deferred updates. And I made DeferredDominator flush in it's deconstructor. And I may need to move DeferredDominance into Analysis to use PostDomTrees - I'll check this.
Looks nice, I think that using DeferredDominance as a single DomTree updated object is the right direction.
I just quickly went though the patch and left some comments, but I think this approach is good enough for the first step. Ideally, DDT should update DT first and use the reachability information to update PDT faster. I will comment more on this later when I find some more time.
include/llvm/Analysis/PostDominators.h | ||
---|---|---|
33 | Nice! | |
lib/Transforms/Utils/BasicBlockUtils.cpp | ||
324 | We can reserve Updates here. 1 + 2 * Preds.size()? | |
350 | Do we always have to flush DDT when we get to this condition here? I think that it's possible to tell that Pred is not reachable if it's enqueued for deletion. | |
350 | Just eyeballing the function, this seems to be the only use od DT in the rest of this function. I'm not sure if always flushing it above is not too conservative/costly. | |
538 | SplitBlockPredecessors(BB, Preds, Suffix, DT ? DDT(*DT) : nullptr, LI, PreserveLCSSA);? | |
lib/Transforms/Utils/LoopSimplify.cpp | ||
160 | Like above. | |
469 | Nit: we can reserve here as well. | |
632 | I would appreciate an extra line here, the continue blends in and it's easy to miss it. | |
666 | SmallVector? | |
668 | Can you rewrite it as a series of if's? It's difficult to parse for me. | |
763 | Same as above. | |
lib/Transforms/Utils/LoopUtils.cpp | ||
1136 | Same. | |
test/Transforms/LoopSimplify/2004-04-13-LoopSimplifyUpdateDomFrontier.ll | ||
1 | Is PDT required? Maybe we should test both with and without it? |
Hello. Sorry for not getting back to this. You gave a good review, but I haven't had the chance to do anything with this (its been "second" on todo list for the longest time). Essentially I dislike the duplicated functions here enough that I don't think it's workable as-is.
lib/Transforms/Utils/BasicBlockUtils.cpp | ||
---|---|---|
350 | I think, because this is already working in loop simplify, the number of updates will not be big. It already works fine with an non-deferred updater, so the cost of doing DT updates will likely not be massive. Could a block be already unreachable, but not queued for deletion? | |
538 | DDT is currently passed as a pointer and takes a reference to a DT. So we dont want to create one if DT is nullptr, and I dont think using SplitBlockPredecessors(.., DT ? &DeferredDominance(DT) : nullptr, ..) is valid use of a rvalue. I also dont think having these multiple function definitions is sensible/scalable. My current best guess for how to do this properly and incrementally would be to have something like a "DominatorTreeUpdater" class that both DT and DDT inherrit from. It would have an interface similar to DDT here (insertEdge, deleteEdge, applyUpdate, getDT, getPDT), and would allow a DT and a DDT (holding a DT/PDT) to be used identically. Functions such as this one could then be switched over to use the new class type without changing any external code. There may be better ideas out there. | |
lib/Transforms/Utils/LoopSimplify.cpp | ||
668 | I agree :) |
lib/Transforms/Utils/BasicBlockUtils.cpp | ||
---|---|---|
350 | OK.
I'm not sure if it makes sense in this API, but in general yes. And a previously unreachable block can also be (non-trivially) enqueued for insertion | |
538 | I missed the pointer/reference difference here. I'm fine with how it is right now, but I really think we have to consider better design in the future. |
Nice!