Merge basic blocks that are exactly the same, inspired by https://bugs.llvm.org/show_bug.cgi?id=46402.
There's stuff to figure out here first, such as a 350% compile-time regression on CMakeFiles/clamscan.dir/libclamav_nsis_LZMADecode.c.o.
Paths
| Differential D82730
[SimplifyCFG] Merge identical basic blocks (WIP) AbandonedPublic Authored by nikic on Jun 28 2020, 2:15 PM.
Details
Summary Merge basic blocks that are exactly the same, inspired by https://bugs.llvm.org/show_bug.cgi?id=46402. There's stuff to figure out here first, such as a 350% compile-time regression on CMakeFiles/clamscan.dir/libclamav_nsis_LZMADecode.c.o.
Diff Detail Event TimelineComment Actions I think @AndrewLitteken & @paquette are working on adding infrastructure for IR outlining. Might be worth syncing up on sharing the IR equivalence checks/hashing. It would be good to have a single place where we define those kinds of IR equivalence for merging/outlining, rather than having multiple slightly diverging ones. Also, I am not sure if SimplifyCFG is the ideal place to do this transform, as it runs quite a few times in the pipeline. I think running the merging sometime after GVN should allow us to catch most cases. Also, various parts in the backend do not handle huge BBs well (in terms of compile time), so we should probably avoid creating BBs that are too large. Comment Actions Improve performance, mainly by hashing in successors, which eliminates a lot of blocks that only differ in successors. Comment Actions
For reference, these are the current compile-time numbers: http://llvm-compile-time-tracker.com/compare.php?from=ef79026aad943fa9b5c12d462127cd9dfa9ea0d8&to=8fb2407c2f795d92f2e5d69a483ab749eff488f2&stat=instructions After fixing some obvious issues, the merging itself appears to be quite cheap. The big compile-time effect (in either direction) is because this changes the amount of IR (text-size changes) and has impact on following optimizations (the largest regression is addressed by D82799). That said, the particular placement here is just an initial attempt that puts it next to the existing merging code for return blocks, it may not be the best placement. Having it as a separate pass, or in a different position in SimplifyCFG may make more sense. Performing merging after GVN seems too late. The block merging is intended not just as a code-size optimization, but also as a means to enable further control-flow optimization. As such, it should probably at least happen before jump threading.
I'm not immediately seeing how this optimization would result in larger blocks? Comment Actions We're using a similar method for how to do the Instruction equivalence checking by using isSameOperationAs for most of the checking, but is there a reason you are not using isIdenticalTo when checking that instructions are equivalent? It looks like it is doing much of the same thing, including checking if operands and phi nodes are the same. Comment Actions I like this, but it's going to run into the same issues as MergeFuncs. Can you sync with @hiraditya and @vish99 to solve both block comparison and function comparison using the same methodology? The we can turn both of them on by default. Or in other words, how does this not have exactly the same issues MergeFuncs has? The principal one is that the IR comparator goes out of sync with IR definition, and we get miscompiles because of this. Block hashing seems useful. mergeFuncs also hashes IR, but for the whole function. I wonder if hashing blocks should be an analysis pass, which we invalidate when we change IR, and which both this pass as well as MergeFuncs consumes (i.e. try to de-dup functions, then blocks, and use the block hash to create a hash for entire functions). Comment Actions SimplifyCFG doesn't look like a good place to put this as it is called multiple times and impacts compile time significantly. Comment Actions
isIdenticalTo isn't being used here, because the operand equality check does not use identity. In particular, for blocks like %x1 = add 1, 2 %y1 = add %x1, 3 %x2 = add 1, 2 %y2 = add %x2, 3 We need to determine that the blocks are the same, but %x1 and %x2 are only equal after remapping. That's what the ValuesEqual predicate checks, and how this differs from isIdenticalTo.
The issue with MergeFuncs is that it does not use simple equality. First, it wants to merge code that is working on different, but bitwise equivalent, types. Second, IIRC it needs an ordering, not just an equality comparison. For that reason it reimplements the comparison logic from scratch. This transform is based on isSameOperationAs(), which is a standard instruction comparison operation used by a number of other transforms (including SimplifyCFG).
Possibly, but I suspect it's not worthwhile to treat it as an analysis pass. The hash calculation is not particularly expensive as these things go, and would be invalidated by pretty much any transform. Comment Actions Improve phi handling. Blocks with phi nodes can be merged, but it requires some care when backedges are involved.
Comment Actions Not working on this at present, and this would need better heuristics on when it is profitable to do this.
Revision Contents
Diff 275554 llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
llvm/test/Transforms/LoopDeletion/simplify-then-delete.ll
llvm/test/Transforms/LoopUnswitch/infinite-loop.ll
llvm/test/Transforms/PGOProfile/chr.ll
llvm/test/Transforms/SimplifyCFG/ForwardSwitchConditionToPHI.ll
llvm/test/Transforms/SimplifyCFG/HoistCode.ll
llvm/test/Transforms/SimplifyCFG/X86/switch_to_lookup_table.ll
llvm/test/Transforms/SimplifyCFG/duplicate-landingpad.ll
llvm/test/Transforms/SimplifyCFG/merge-blocks.ll
llvm/test/Transforms/SimplifyCFG/wc-widen-block.ll
|
I have a similar checking mechanism for checking if two sections of IR are similar, so it might make sense to create a shared utility function for the instruction mapping/comparison section. Where a basic block or range of instructions is passed in and the "isSameOperationAs" check for the range is performed, then if we develop new mechanisms for whether these sections are the same operations both passes get the same gains.