The original movitavion was the following code pattern coming out of
Clang itself:
define i64 @_ZNK5clang4Sema12getExprRangeEPNS_4ExprE(%"class.clang::Sema"* nocapture readnone %this, %"class.clang::Expr"* %E) { %1 = icmp eq %"class.clang::Expr"* %E, null br i1 %1, label %6, label %2 ; <label>:2 ; preds = %0 %3 = bitcast %"class.clang::Expr"* %E to %"class.clang::Stmt"* %4 = tail call i64 @_ZNK5clang4Stmt14getSourceRangeEv(%"class.clang::Stmt"* %3) %5 = and i64 %4, -4294967296 %phitmp = and i64 %4, 4294967295 br label %6 ; <label>:6 ; preds = %0, %2 %.sroa.3.0 = phi i64 [ %5, %2 ], [ 0, %0 ] %.sroa.0.0 = phi i64 [ %phitmp, %2 ], [ 0, %0 ] %7 = or i64 %.sroa.0.0, %.sroa.3.0 ret i64 %7 }
This really should simplify away, but it doesn't. My first attempt to
help teach LLVM about these kinds of patterns is this patch. It doesn't
handle the above case (where two separate PHI nodes would both need to
be simplified at the same time), but it handles many other cases.
When using a Clang with this patch to build the LLVM test suite, we
fully simplify approximately 2000 phi nodes, and we hoist a user into
one predecessor (simplifying the other predecessors) 57 times.
I'm still not sure this is the right approach. It is fairly heavy weight
(as you'll see) and I don't see a good way to extend its current
strategy to handle hoisting around multiple phi nodes. =/
I'm thinking of trying a different approach which would be a generic
combine firing on users of phi nodes, and try to hoist that user into
a predecessor to expose more simplifications. This would for example be
able to handle cases where both operands were phi nodes. But this
wouldn't completely obviate the patch I have here -- there are cases
that this will fire on that the other might not.
The other overall concern I have about this is that it really needs to
be "late" in instcombine. We only want to hoist around phi nodes after
every user of the instruction has already been combined and the current
instruction's other combines have already been exhausted. Unfortunately
this is the exact opposite order of combining from what instcombine
usually does. I'm not really sure how to address this.
Thoughts and suggestions on the issue at large very welcome!