If we have an icmp->brcond->br sequence where the brcond just branches to the next block jumping over the br, while the br takes the false edge, then we can modify the conditional branch to jump to the br's target while inverting the condition of the incoming icmp. This means we can eliminate the br as an unconditional branch to the fallthrough block.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
I haven't read the patch yet, but just from reading the description I remembered that the DAG does this while building up the DAG. Maybe this needs to be done in the IRTranslator to keep this behavior? (This helps while comparing codegen with sdag/gisel).
llvm/lib/Target/AArch64/AArch64InstructionSelector.cpp | ||
---|---|---|
1171 ↗ | (On Diff #208448) | Taking advantage of fall-through sounds like something beneficial for all targets. (Then you can call this specific helper in here.) |
FWIW I would like some control over what boolean sources are acceptable to invert. SelectionDAG inserting a condition invert between a control flow intrinsic and the branch required work to undo
This seemed a little presumptive to do for all targets for me. I can try to move it earlier into a Combiner helper, since I don't want to put optimizations into the IRTranslator unless we can't do it in other places.
In this case I'm just inverting the condition code, we shouldn't need to insert extra instructions.
llvm/lib/Target/AArch64/AArch64InstructionSelector.cpp | ||
---|---|---|
1186 ↗ | (On Diff #208448) | Branch is a terminator by nature, so it doesn't check the same thing, although the usefulness of this assert is debatable. |
1197 ↗ | (On Diff #208448) | For arm64 it seems to be the only thing in between compares and branches in practice due to our legalization rules. This code will be removed anyway since I'm going to put this before the legalizer so we shouldn't see any artifacts in between the ops, |
New patch to do the optimization in the CombinerHelper and add it to the prelegalizer combiner for AArch64.
This doesn't get all the code size improvements of the last patch, I'll make that a separate patch. On it's own, this change gives a 2.5% geomean improvement on CTMark -O0.
Unless others have comments, this looks good to go in to me.
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
392 ↗ | (On Diff #208510) | Orthogonal to this patch - I've found that MachineOperand::getPredicate returning an unsigned and users having to explicitly cast to Predicate is a little annoying. What do you think of getPredicate returning Predicate type (in a separate patch)? In my use cases I've almost never used the unsigned value without casting to Predicate. |
Marking as accepted.
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
392 ↗ | (On Diff #208510) | That sounds reasonable to me. |