I'd like to propose changing the order in which the SimplifyCFG and SROA passes run. Some optimizations in SimplifyCFG are enabled by SROA. Presently, SimplifyCFG runs both before and after SROA, but some less advantageous SimplifyCFG transformations can fire in the first run and deny better SimplifyCFG transformations a chance. Changing the order of these two passes will resolve performance bug 27555 (https://llvm.org/bugs/show_bug.cgi?id=27555). I don't know how the ordering of these two passes was decided initially, so I am hoping someone can tell me whether this order was deliberate.
Specifically, there are two functions in SimplifyCFG that deal with long sequences of comparisons: FoldValueComparisonIntoPredecessors and FoldingBranchToCommonDest. These two transformations have overlapping opportunities, but when both are applicable, FoldValueComparisonIntoPredecessors is typically better. Presently, even though FoldValueComparisonIntoPredecessors runs before FoldingBranchToCommonDest within a SimplifyCFG run, FoldingBranchToCommonDest wins, because FoldValueComparisonIntoPredecessors only applies after values have been promoted to registers, whereas FoldingBranchToCommonDest can apply before that promotion.
FoldingBranchToCommonDest transforms code that looks like:
if (A) goto LabelX
if (B) goto LabelX
if (C) goto LabelX
into:
if (A || B || C || ...) goto LabelX
where A, B, C, etc. are expressions without side effects.
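To make the shape of this fold concrete, here is a minimal C sketch of the before and after forms. The function names chain and folded are hypothetical, chosen only for illustration; the real transformation works on LLVM IR branches, not on C source.

```c
#include <stdbool.h>

/* Before: three separate conditional branches to the same label. */
int chain(bool a, bool b, bool c) {
    if (a) goto LabelX;
    if (b) goto LabelX;
    if (c) goto LabelX;
    return 0;
LabelX:
    return 1;
}

/* After: a single branch on the combined condition.
   This is only equivalent because a, b and c have no side effects. */
int folded(bool a, bool b, bool c) {
    if (a || b || c) goto LabelX;
    return 0;
LabelX:
    return 1;
}
```

Both functions return the same result for every combination of inputs; the folded form trades several branches for one larger boolean expression.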
FoldValueComparisonIntoPredecessors transforms code that looks like:
if (x == A) goto LabelX
if (x == B) goto LabelY
if (x == C) goto LabelZ
into:
switch (x) {
case A: goto LabelX
case B: goto LabelY
case C: goto LabelZ
...
}
where A, B, C, etc. are constant values.
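The same idea as a small C sketch, assuming arbitrary constants for A, B and C. The names compare_chain and as_switch are hypothetical, and the return values merely stand in for the code at each label.

```c
enum { A = 10, B = 20, C = 30 };  /* arbitrary constants for illustration */

/* Before: a chain of equality tests against the same value x. */
int compare_chain(int x) {
    if (x == A) goto LabelX;
    if (x == B) goto LabelY;
    if (x == C) goto LabelZ;
    return 0;
LabelX: return 1;
LabelY: return 2;
LabelZ: return 3;
}

/* After: one multi-way branch on x. */
int as_switch(int x) {
    switch (x) {
    case A: goto LabelX;
    case B: goto LabelY;
    case C: goto LabelZ;
    default: return 0;
    }
LabelX: return 1;
LabelY: return 2;
LabelZ: return 3;
}
```

Note that this form requires every comparison to test the same x, which is why the fold depends on x being a single register value rather than a series of separate loads from memory.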
In situations where both transformations apply, FoldValueComparisonIntoPredecessors tends to be better, since LLVM can lower the resulting switch more efficiently than the complex boolean expression generated by FoldingBranchToCommonDest.
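As an illustration of the overlap, consider a chain of constant comparisons on one value that all branch to the same label. Either transformation could apply to it. The sketch below is a C rendering with hypothetical function names and arbitrary constants, not code taken from the bug report.

```c
/* Original: equality tests on one value, all jumping to the same label. */
int original_chain(int x) {
    if (x == 1) goto Hit;
    if (x == 3) goto Hit;
    if (x == 5) goto Hit;
    return 0;
Hit:
    return 1;
}

/* Shape produced by folding the branches into one boolean expression. */
int or_form(int x) {
    if (x == 1 || x == 3 || x == 5) goto Hit;
    return 0;
Hit:
    return 1;
}

/* Shape produced by folding the comparisons into a switch. */
int switch_form(int x) {
    switch (x) {
    case 1:
    case 3:
    case 5:
        goto Hit;
    default:
        return 0;
    }
Hit:
    return 1;
}
```

All three compute the same result; the difference lies only in which form the later lowering stages get to work with.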