Dead basic blocks may be forming a loop, for which SSA form is
fulfilled, but with a circular def-use chain. LoadCombine could
enter an infinite loop when analysing such dead code. This patch
solves the problem by simply avoiding to analyse all basic blocks
that aren't forward reachable, from function entry, in LoadCombine.
Details
Diff Detail
Event Timeline
Seems to solve:
opt /tmp/infinite.ll -jump-threading -load-combine
in https://bugs.llvm.org/show_bug.cgi?id=27065
lib/Transforms/Scalar/LoadCombine.cpp | ||
---|---|---|
233 | presumably checking IsAlive is cheaper than calling skipBasicBlock, so maybe it should be tested first? |
You don't want to :)
lib/IR/LegacyPassManager.cpp | ||
---|---|---|
1283 |
If you mean "forward unreachable", please use that.
I'm pretty strongly against forcing a bb order in a generic pass infrastructure.
Are they getting anything from being basicblockpasses? It looks like they could all be trivially changed to functionpasses, and those that can't handle dead code, could just be changed to do what they need to. For example, DCE has a dominator tree around, it can tell whether things are forward unreachable without wasting time doing a reverse postorder traversal. IMHO, that is a better solution (and helps new PM) than forcing a pass order here. |
lib/IR/LegacyPassManager.cpp | ||
---|---|---|
1283 | What do you mean by "forcing an order"? We have not changed the order in which basic block passes are executed here. All we did was adding the IsAlive flag to tell the basic block passes if the BB it should operate on is "forward reachable" (from the function entry) or not. And then it is up to each BasicBlockPass to make a choice if it should operate on the BB or not. The idea was that passes like the BasicBlockPassPrinter should be able to also print the dead blocks, whereas LoadCombine could avoid spending time on optimizing the code that most likely will be removed by DCE later anyway. So that is why we provide the flag instead of simply skipping to iterate over the "dead" blocks. |
lib/IR/LegacyPassManager.cpp | ||
---|---|---|
1283 | Yes, you are right, i misread that part. I'm still strongly of the opinion to simply kill basicblockpass rather than add another O(basic blocks) walk that not everything needs. Every other pass also properly handles dead or unreachable code themselves, without help from the pass infrastructure. |
I have updated according to most of the comments.
I think that the re-factoring of all basic block passes to function passed should be done as a separate task.
Note: You are still calling it "IsAlive", which again, does not really tell anyone what it is supposed to mean :)
However, ...
I think that the re-factoring of all basic block passes to function passed should be done as a separate task.
That's fine, if you want to fix it by making BB passes more expensive and doing another walk of all BB's, you are welcome to.
This, to me, feels like a hack-around to try to paper over how these passes operate.
Someone else can approve that if they are comfortable with it :)
Solution was changed into using the DominatorTree analysis to
determine if basic blocks are "dead". This makes the patch
local to LoadCombine.cpp.
KennethH gave me permission to upload the improved solution (that uses DominatorTree analysis). This feels like a more "kosher" solution.
The currently suggested solution (for avoiding to optimize on dead code in LoadCombine) is similar to the solution used in BBVectorize. So there is really nothing new invented here.
I'm just afraid that I might have missed something fundamental about "adding a new analysis" (my experience in playing around with analysis passes is very limited).
So any feedback is welcome!
This looks pretty straightforward to me. I think if there are no other concerns, I think this should go in.
lib/Transforms/Scalar/LoadCombine.cpp | ||
---|---|---|
69 | I'm not an expert on LoadCombine but I think this preserve is correct because LoadCombine doesn't do any cross-BB combining or modifies the CFG structure. That said, if what I say is correct, I expect setPreserveCFG to preserve the DominatorTree already? Can you please double check? | |
test/Transforms/LoadCombine/deadcode.ll | ||
2 | Can you add a CHECK line, to make sure we leave the block alone? |
Addressed comments from davide:
- Removed the not needed explicit addPreserved for DominatorTree (it is included in setPreservesCFG).
- Generated CHECKs in the test case.
I also added a second basic block to the test case. This was useful for some manual test to see that the Dominator Tree only is created once per function, even if the LoadCombine pass is executed multiple times (for each BB). I.e. checking that the DomTree actually is preserved.
If you mean "forward unreachable", please use that.
I'm pretty strongly against forcing a bb order in a generic pass infrastructure.
Are they getting anything from being basicblockpasses?
It looks like they could all be trivially changed to functionpasses, and those that can't handle dead code, could just be changed to do what they need to.
For example, DCE has a dominator tree around, it can tell whether things are forward unreachable without wasting time doing a reverse postorder traversal.
IMHO, that is a better solution (and helps new PM) than forcing a pass order here.