Do not reorder and move up a loop latch block when optimising for size because this will generate an extra unconditional branch. Also some more cleaning up of comments.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/CodeGen/MachineBlockPlacement.cpp | ||
---|---|---|
1112 ↗ | (On Diff #64511) | Can we change the two ifs into an if-else clause and move the "return nullptr" outside the if-else? (and also get rid of the braces after the ifs). This will make the code more compact and reduce duplication. |
lib/CodeGen/MachineBlockPlacement.cpp | ||
---|---|---|
1112 ↗ | (On Diff #64511) | I don't see how I can do this with an if-else and keep the different debug messages. I created these 2 different ifs because I wanted to see the exact reason for not selecting a new header. |
Please split the comment changes into a different patch
lib/CodeGen/MachineBlockPlacement.cpp | ||
---|---|---|
88 ↗ | (On Diff #64511) | Just one line comment defining "Outlining" should be put here. The rest do not fit. |
578 ↗ | (On Diff #64511) | Remove redundunat "and Pred needs to be ....". Actually I don't see the need of modification of this paragraph. |
658 ↗ | (On Diff #64511) | add "is less than probability threshold " |
Hi David, thanks for reviewing.
I will separate the functional changes from the tidy up. This patch implements the functional change.
Cheers.
lib/CodeGen/MachineBlockPlacement.cpp | ||
---|---|---|
928 ↗ | (On Diff #65040) | There is one case where the existing header is not the cfg successor of its layout predecessor. In this case, rotating loop does not actually add additional size cost. |
I have added checks to see if the header's layout predecessor is a fallthrough or unconditional branch to the loop header. In that case we do not move up the latch block, but we still do otherwise.
Can you add a test case to cover that rotation happens when there is no fall through predecessor for the header?
lib/CodeGen/MachineBlockPlacement.cpp | ||
---|---|---|
937 ↗ | (On Diff #65754) | This should be renamed to findFallthroughPredecessor and add a brief documentation. LayoutPredecessor by definition does not need to be its cfg predecessor. |
975 ↗ | (On Diff #65754) | if (...->canFallthrough()) { |
It's funny but I haven't been able to create an extra test case that still does the reordering. The issue in creating that test case is, i) when I manually reorder blocks in the .ll files to create a layout predecessos that cannot fallthrough, then somehow the blocks still end up in the original order again and there is a fallthrough, and ii) even for conditional branches to the loop the canFallThrough function returns true, which again prevents me from creating a case that does not fallthrough.
Re the new test case -- you can just modify the test case by branch directly to block %bb instead of %loop from entry.
Thanks for the suggestion!
I have been looking at it again and with the change to jump to bb from entry the cfg looks likes this:
entry------- | loop | / \ | blocka bb - - | exit
the problem however is that blocks are placed like this (before pass MBP):
entry
block_a
loop
bb
exit
and when it is looking for a best loop top for "loop", it finds that "block_a" is placed before "loop", and it can fallthrough to "loop". So this does not result in "block_a" being placed before "loop". Actually, while writing tests, there will always be some preheader that fallsthrough to the loop.
The fact that it is difficult to come up with a test is a bit worrying, in the sense that we add code in form of hasFallthroughLayoutPredecessor() that is always going to return true, so we will never end rotating the latch block to the top (when optimising for size). That in itself is fine, I think, because it shows that it in theory there is this case, but in practise it never happens. If we agree on this (I may have missed something), then we can make the implementation much simpler. We get rid of hasFallthroughLayoutPredecessor(), and simply don't do findBestLoopTop() when we optimize for size. By the way, while looking at this again I found that hasFallthroughLayoutPredecessor() is a bit dodgy because there is much simpler implementation possible, but it is not really important for now.
lgtm with the nit.
lib/CodeGen/MachineBlockPlacement.cpp | ||
---|---|---|
976 ↗ | (On Diff #68010) | if (F->getFunction()->optForSize()) .. |