Currently, the following C example fails to be partially inlined. The issue is that computeOutliningColdRegionsInfo only accepts candidate blocks that have a single entry, and blocks that dominate only themselves fail that check. Example (this is also added as an LLVM IR test):
int callee(int c1, int c2) {
  int rc = 0;
  switch (c1) {
  case 0: // cold
    rc = 1;
    break;
  case 1: // warm
    rc = 2;
    break;
  case 2: // cold
    rc = 4;
    break;
  default: // hot
    rc = c2;
  }
  return rc;
}

int caller(int c1) {
  int rc = callee(c1, c1);
  return rc;
}
With this patch, the code in the two cold switch cases is outlined and the remainder of the function is inlined into its caller.
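For illustration, the intended result is roughly equivalent to the hand-written C below. This is a sketch at the source level, not actual compiler output: the names callee_cold_case0 and callee_cold_case2 are hypothetical, and the real pass works on LLVM IR and picks its own names for the outlined functions.

```c
/* Hypothetical outlined bodies of the two cold switch cases.
   The names are made up for this sketch. */
int callee_cold_case0(void) { return 1; }
int callee_cold_case2(void) { return 4; }

/* caller after the remaining (warm and hot) paths of callee
   have been inlined into it. */
int caller(int c1) {
  int rc;
  switch (c1) {
  case 0: /* cold: stays an out-of-line call */
    rc = callee_cold_case0();
    break;
  case 1: /* warm: inlined */
    rc = 2;
    break;
  case 2: /* cold: stays an out-of-line call */
    rc = callee_cold_case2();
    break;
  default: /* hot: inlined; c2 equals c1 at this call site */
    rc = c1;
    break;
  }
  return rc;
}
```

The observable behaviour of caller is unchanged; only the cold paths still pay for a call.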
This change allows the single basic block to have multiple predecessors, which is what this function wants to avoid.
How about changing BlockList.size() > 1 && to BlockList.size() >= 1 &&?
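To make the suggestion concrete, here is a toy model of the check in plain C. It assumes that the condition following BlockList.size() > 1 && is a test that the entry block has exactly one predecessor; that part is not shown in this thread, so treat it as an assumption. The function names and parameters are hypothetical and do not correspond to the LLVM API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the existing check (assumed shape): a region is
   accepted only if it has more than one block and its entry block
   has exactly one predecessor. */
bool accepts_existing(size_t block_count, unsigned entry_pred_count) {
  return block_count > 1 && entry_pred_count == 1;
}

/* Toy model of the suggested check: single-block regions are allowed,
   but the predecessor test still applies to them. */
bool accepts_suggested(size_t block_count, unsigned entry_pred_count) {
  return block_count >= 1 && entry_pred_count == 1;
}
```

Under this model, a single block with one predecessor (such as each cold case in the example) is rejected by the existing check and accepted by the suggested one, while a single block with two predecessors is still rejected.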