This is an archive of the discontinued LLVM Phabricator instance.

[MBP] Don't outline short optional branches
ClosedPublic

Authored by djasper on Mar 6 2015, 8:00 AM.

Details

Reviewers
chandlerc
Summary

With the option -outline-optional-branches, LLVM will place optional branches out of line (more details on r231230).

With this patch, this is not done for short optional branches. A short optional branch is a branch containing a single block with an instruction count below a certain threshold (defaulting to 3). Still everything is guarded under -outline-optional-branches).

Outlining a short branch can't significantly improve code locality. It can however decrease performance because of the additional jmp and in cases where the optional branch is hot. This fixes a compile time regression I have observed in a benchmark.

Diff Detail

Event Timeline

djasper updated this revision to Diff 21359.Mar 6 2015, 8:00 AM
djasper retitled this revision from to [MBP] Don't outline short optional branches.
djasper updated this object.
djasper edited the test plan for this revision. (Show Details)
djasper added a reviewer: chandlerc.
djasper added a subscriber: Unknown Object (MLST).
djasper updated this object.Mar 6 2015, 8:03 AM
chandlerc added inline comments.Mar 6 2015, 1:48 PM
lib/CodeGen/MachineBlockPlacement.cpp
394

Shouldn't this be disabling the return of Succ here rather than arbitrarily returning Pred?

And in general, it seems very strange to look at all of the predecessors' predecessors here. That seems really expensive and it isn't clear why. This at least needs some serious commenting to clarify the algorithm used.

djasper updated this revision to Diff 21472.Mar 9 2015, 2:45 AM
djasper added inline comments.
lib/CodeGen/MachineBlockPlacement.cpp
394

Addressed both concerns.

chandlerc accepted this revision.Mar 20 2015, 1:40 AM
chandlerc edited edge metadata.

I'm fine for this to go in behind the flag. It makes the code strictly better.

However, looking at this and thinking about it is really confirming for me that this isn't quite the correct approach. Here are my thoughts, mostly for reference going forward:

Currently, the code is working to select a definite successor to lay out next in the chain. But some of the time, we don't want to do that because there is a non-definite successor that doesn't have a CFG conflict and is small. I feel like it would be better to phrase this whole thing from the perspective of actually *outlining*. Rather than returning a successor early, I wonder if it would work better to *skip* successors which are non-small and optional.

I've tried to think through how this would be implemented, and sadly it doesn't seem straight forward. I really feel like we're struggling with a much more deeply flawed design here and that is why nothing seems to work elegantly.

lib/CodeGen/MachineBlockPlacement.cpp
399–414

I feel like this code would be much easier to read as a lambda predicate.

This revision is now accepted and ready to land.Mar 20 2015, 1:40 AM
djasper closed this revision.Mar 20 2015, 3:03 AM

Turned into lambda and committed as r232802.