With this change, the pass becomes more of a "full loop unswitching" pass
rather than something substantively simpler than other approaches. I plan
to rename it accordingly once the dust settles.
The key ideas of the new loop unswitcher are carried over for
non-trivial unswitching:
- Fully unswitch a branch or switch instruction from inside of a loop to outside of it.
- Update the CFG and IR. This avoids needing to "remember" the unswitched branches, and it avoids both excessive cloning and reliance on complex parts of simplify-cfg to clean up the CFG.
- Update the analyses rather than just blowing them away or relying on something else updating them.
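For readers less familiar with the transformation, here is a minimal source-level sketch of what fully unswitching a branch means. It is only an illustration: the pass works on LLVM IR, and the function names `sumBefore` and `sumAfter` are made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Before unswitching: the loop-invariant condition `scale` is tested on
// every iteration even though its value never changes inside the loop.
int sumBefore(const std::vector<int> &v, bool scale) {
  int sum = 0;
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (scale)
      sum += 2 * v[i];
    else
      sum += v[i];
  }
  return sum;
}

// After unswitching: the branch is hoisted out of the loop and each copy
// of the loop keeps only the code reachable on its side of the branch.
int sumAfter(const std::vector<int> &v, bool scale) {
  int sum = 0;
  if (scale) {
    for (std::size_t i = 0; i < v.size(); ++i)
      sum += 2 * v[i];
  } else {
    for (std::size_t i = 0; i < v.size(); ++i)
      sum += v[i];
  }
  return sum;
}
```

Both functions compute the same result; the second trades code size (two loops) for removing the per-iteration test.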
Sadly, #3 is somewhat compromised here as the dominator tree updates
were too complex for me to want to reason about. I may take another stab
at it, but it may be best to wait for something like the newly proposed
dynamic dominators. However, we do adhere to #3 w.r.t. LoopInfo.
This approach also follows an important principle specific to non-trivial
unswitching: not *all* of the loop will be duplicated when unswitching.
This fact allows us to compute the cost in terms of how much *duplicate*
code is inserted rather than just on raw size. Unswitching conditions
which essentially partition loops will work regardless of how large the
loop is in reality.
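To make the cost argument concrete, here is a toy model of it. This is a sketch under stated assumptions, not the pass's actual cost computation: `BlockInfo`, `Reach`, `rawLoopSize`, and `duplicatedCost` are hypothetical names, and each block is assumed to already be classified by which side of the unswitched branch can reach it.

```cpp
#include <vector>

// Hypothetical classification of a loop block relative to the branch
// being unswitched: reachable from both successors, or from only one.
enum class Reach { Both, TrueOnly, FalseOnly };

// Hypothetical per-block summary: an abstract size plus its reachability.
struct BlockInfo {
  int size;
  Reach reach;
};

// Raw size of the loop: every block counts.
int rawLoopSize(const std::vector<BlockInfo> &blocks) {
  int total = 0;
  for (const BlockInfo &b : blocks)
    total += b.size;
  return total;
}

// Duplicated size: only blocks reachable from both sides of the branch
// end up in both copies of the loop, so only they add new code.
int duplicatedCost(const std::vector<BlockInfo> &blocks) {
  int cost = 0;
  for (const BlockInfo &b : blocks)
    if (b.reach == Reach::Both)
      cost += b.size;
  return cost;
}
```

In this model, a loop with a small shared header and two large one-sided bodies has a large raw size but a small duplicated cost, which is why a condition that essentially partitions the loop stays cheap to unswitch.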
Unfortunately, there is a *lot* of code to implement all of this. And
I'm still not really happy with all of it. I think there is more
factoring and cleanup to be done here, but I wanted to at least get it
out where others can see and comment on it.
Some high-level outstanding things that I'd like to defer to subsequent
patches:
- We could be much more clever about not cloning things that will be deleted. In fact, we should be able to delete *nothing* and do a minimal number of clones.
- There are many more interesting selection criteria for which branch to unswitch that we might want to look at. One that I'm particularly interested in is a set of conditions which all exit the loop and which can be merged into a single unswitched test.
Anyways, even with somewhat rough code, hopefully folks can start chewing
on this and giving feedback.
Depends on D34049.
I have a question about the default value of false for "enable-nontrivial-unswitch". Could it be changed to true, since enabling it seems to bring a substantial improvement on benchmarks? Or is there a reason to keep it disabled?