This is an archive of the discontinued LLVM Phabricator instance.

Paths

include/llvm/CodeGen/Passes.h
include/llvm/InitializePasses.h
lib/CodeGen/BranchCoalescing.cpp
lib/CodeGen/CMakeLists.txt
lib/CodeGen/CodeGen.cpp
lib/CodeGen/TargetPassConfig.cpp
test/CodeGen/PowerPC/branch_coalesce.ll
test/CodeGen/PowerPC/select-i1-vs-i1.ll
test/CodeGen/Thumb/select.ll
Differential D28249

Improve scheduling with branch coalescing
ClosedPublic

Authored by lei on Jan 3 2017, 12:39 PM.

Details

Reviewers

echristo
Carrot
kbarton
hfinkel
nemanjai

Commits

rGb223cfabcc6b: Improve scheduling with branch coalescing
rL296670: Improve scheduling with branch coalescing

Summary

Improve scheduling by coalescing branches that depend on the same condition. This pass looks for blocks that are guarded by the same branch condition in the IR and attempts to merge the blocks together. This is done by moving code either up to its predecessor block or down to its successor block.

On power8 LE, we see an 11% improvement for lbm and a 28% improvement for mcf (SPEC2006).

I tried the following test on ARM and X86.

$ cat branchC.ll
; RUN: llc -mcpu=generic -mtriple=x86_64-unknown-linux -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple=armv6-unknown-linux-gnu < %s | FileCheck %s
; RUN: llc -verify-machineinstrs -o - %s -mtriple=aarch64-linux-gnu | FileCheck %s

; Function Attrs: nounwind
define double @testBranchCoal(double %a, double %b, double %c, i32 %x) {
entry:

%test = icmp eq i32 %x, 0
%tmp1 = select i1 %test, double %a, double 2.000000e-03
%tmp2 = select i1 %test, double %b, double 0.000000e+00
%tmp3 = select i1 %test, double %c, double 5.000000e-03

%res1 = fadd double %tmp1, %tmp2
%result = fadd double %res1, %tmp3
ret double %result

}

This does not affect ARM since the LLVM IR produced does not conform to the pattern we expected.
For X86, the branches were not coalesced since the terminators produced contain implicit operands. This code will only coalesce branches whose terminators contain explicit operands.

This is originally reported in: https://llvm.org/bugs/show_bug.cgi?id=25219
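
As a source-level illustration only (the pass itself works on machine instructions, not on C++), the effect described in the summary can be sketched with two functions that compute the same result as the `testBranchCoal` test case. The function names are made up for this sketch. The first mirrors three selects that were each expanded into their own branch on the same condition; the second mirrors the coalesced shape with a single branch.

```cpp
#include <cassert>

// Shape before coalescing: each select became its own branch, and all three
// branches test the same condition.
double sumUncoalesced(double a, double b, double c, int x) {
  bool test = (x == 0);
  double tmp1 = 2.0e-03, tmp2 = 0.0, tmp3 = 5.0e-03;
  if (test) tmp1 = a;
  if (test) tmp2 = b;
  if (test) tmp3 = c;
  return (tmp1 + tmp2) + tmp3;
}

// Shape after coalescing: one branch guards all three assignments.
double sumCoalesced(double a, double b, double c, int x) {
  double tmp1 = 2.0e-03, tmp2 = 0.0, tmp3 = 5.0e-03;
  if (x == 0) {
    tmp1 = a;
    tmp2 = b;
    tmp3 = c;
  }
  return (tmp1 + tmp2) + tmp3;
}

// Only the control flow differs, so both shapes must agree (for non-NaN
// inputs, since NaN never compares equal to itself).
void checkEquivalent(double a, double b, double c, int x) {
  assert(sumUncoalesced(a, b, c, x) == sumCoalesced(a, b, c, x));
}
```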

Event Timeline

lei updated this revision to Diff 82932.Jan 3 2017, 12:39 PM

lei retitled this revision from to Improve scheduling with branch coalescing.

lei updated this object.

lei added reviewers: Carrot, nemanjai, kbarton, echristo.

lei added a subscriber: llvm-commits.

Herald added subscribers: mgorny, mehdi_amini, aemerson. Jan 3 2017, 12:39 PM

lei added a reviewer: hfinkel.Jan 3 2017, 12:44 PM

lei added subscribers: jtony, sfertile, syzaara, craig.topper.

These inline comments are based on a first-pass reading of the patch. As such, they refer to issues that are rather local in scope. This is a large patch so I'll set aside some more time to fully understand it in its entirety in the next review cycle.

lib/CodeGen/BranchCoalescing.cpp
218	We want to reset the stats for every function? I would have assumed we're interested in collecting this for the entire module.
228	I am personally not a fan of functions that modify an output parameter and then fail. Presumably this makes no difference now, but if we end up extending this pass in the future to have some backtracking/requeuing capabilities, it would be nice if the object wasn't modified/invalidated in some way.
337	Use same debug format for both (i.e. space after the colon).
345	Won't this fail (as in assert) if `Op2.isReg() != true`?
357	Maybe a comment as to why we fail here if we get physical registers. Also, if I follow the control flow here correctly, we will return `true` here if one is a physical register and the other is a virtual register. Is that what we want? I imagine not.
379	Not that it makes a difference computationally, but perhaps for consistency:

  for (MachineBasicBlock::instr_iterator Instr : MBB.instrs())
    if (Instr->isPHI())
      return Instr;
  return MBB.instr_end();

Also, I'm not sure what the style guide says regarding the use of `auto` there, but perhaps instead of `MachineBasicBlock::instr_iterator`, it should be `auto`. Finally, if every use of this function will just use `begin()` instead of `end()` when there are no existing PHI's, perhaps it would make sense to actually return `begin()` if there are no PHI's in the MBB.
394	Can you please state why something like `MachineBasicBlock::SkipPHIsLabelsAndDebug()` cannot be used instead of this function (thereby eliminating the need for that function altogether)?
410	You should have a range of iterators that represent the PHI's in the `From` block. Why not use the range-version of `splice()` after this loop? If it is to avoid the empty range check on the PHI's in the `From` block, maybe just a comment to that end.
416	The second sentence is redundant.
420	'... used in this block ...' means '... used in original block ...'? Overall, I think this comment is kind of hard to follow. I think it would suffice to say when an instruction can move to the begin/end location. Something along the lines of:

An instruction MI can move from MBB Source to MBB Target under the following conditions:
- There are no PHI's in Target that use what MI defines
- No instructions in Source use what MI defines (unless Target is a dominator of Source)
- No instructions in Source define what MI uses (unless Source is a dominator of Target)
- If any instructions in Target define what MI uses, the MI can only move to the end of Target
443	I don't understand the purpose of the comment.
467	I don't think this function suffices to determine whether an instruction can move to a block. Example MBB's:

  BB#0:
    ...
    ; some non-PHI def of %vreg2
    ; some non-PHI def of %vreg3
    %vreg10 = ADD %vreg2, %vreg3 ; MI
    ...
  BB#1: ; MBB1
    %vreg5 = PHI %vreg10, <BB#0>, %vreg4, <BB#2>
    ...

The call `canMoveTo(MI, MBB1, false)` would return true, but it is not safe to do so. I think this patch uses this query in very limited circumstances, so it isn't necessarily required for this function to determine whether it is safe to move an arbitrary instruction to an arbitrary block. But if this is the case, we should add asserts to eliminate incorrect uses of the function (in the future).
508	Why are all of these asserts rather than early exits returning false? On builds that compile asserts out, I imagine we could end up doing the wrong thing.
511	I don't think there's anything to be gained from printing the name of the function emitting the message in the debug output. Please remove this.
555	The use of 'From', 'from', 'To' and 'to' makes the comment confusing. Please consider renaming the parameters or otherwise clarifying this comment. Perhaps something like: `Merge the CFG triangle \p From into the preceding CFG triangle \p To.`
609	Are these two asserts enough or does `From.BranchBlock` need to [immediately] dominate both `To.BranchTargetBlock` and `To.FallThroughBlock`?
621	You've already moved all the PHI's from `From.BranchBlock`. Can't this just be a loop over `From.BranchBlock->instrs()`? Or it may need to be a loop from `rbegin()` to `rend()`.
627	This is loop-invariant. I think it'd be cleaner to move the assert out of the loop and call `splice()` on the result of a ternary operator. Maybe even invert the `if` and the loop.
656	Where is the assert to ensure this?
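
To make the comments on lines 420 and 467 concrete, here is a minimal sketch of the two Target-side conditions from the list in the line 420 comment (no PHI in Target uses what MI defines; a definition in Target of something MI uses restricts MI to the end of Target). The types and function names are hypothetical stand-ins, not the patch's code, and the Source-side conditions with their dominance exceptions are deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for MachineInstr and MachineBasicBlock.
struct ToyInstr {
  bool IsPHI;
  std::vector<int> Defs; // virtual registers this instruction defines
  std::vector<int> Uses; // virtual registers this instruction reads
};
using ToyBlock = std::vector<ToyInstr>;

static bool contains(const std::vector<int> &Regs, int Reg) {
  return std::find(Regs.begin(), Regs.end(), Reg) != Regs.end();
}

// True if a PHI in Target reads a register that MI defines.
static bool targetPHIUsesDefOf(const ToyInstr &MI, const ToyBlock &Target) {
  for (const ToyInstr &I : Target)
    if (I.IsPHI)
      for (int Def : MI.Defs)
        if (contains(I.Uses, Def))
          return true;
  return false;
}

// True if any instruction in Target defines a register that MI reads.
static bool targetDefinesUseOf(const ToyInstr &MI, const ToyBlock &Target) {
  for (const ToyInstr &I : Target)
    for (int Use : MI.Uses)
      if (contains(I.Defs, Use))
        return true;
  return false;
}

// MI may go to the start of Target only if neither hazard is present.
bool canMoveToBeginning(const ToyInstr &MI, const ToyBlock &Target) {
  assert(!MI.IsPHI && "PHIs are expected to be moved separately");
  return !targetPHIUsesDefOf(MI, Target) && !targetDefinesUseOf(MI, Target);
}

// MI may go to the end of Target even if Target defines MI's inputs,
// but never if a PHI in Target consumes MI's result.
bool canMoveToEnd(const ToyInstr &MI, const ToyBlock &Target) {
  assert(!MI.IsPHI && "PHIs are expected to be moved separately");
  return !targetPHIUsesDefOf(MI, Target);
}
```

With the example from the line 467 comment (MI defines register 10 and the PHI in MBB1 reads it), both queries in this sketch return false.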

lei marked 12 inline comments as done.Jan 22 2017, 9:58 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
218	Initialization removed.
228	agreed.
337	fixed
345	No, since we check that Op2 and Op1 are the same type on line 330. We do an early exit if they are not the same type. getType() returns an enum MachineOperandType, isReg() basically checks that the MachineOperand's type (MachineOperandType) is of the specific enum MO_Register.
357	It was just put in to see if we trip on it. I have removed this.
379	Can begin() also be a PHI? If not, then that is okay, if it can then it is better to return end() so that we know there are no PHIs in this block.
394	MachineBasicBlock::SkipPHIsLabelsAndDebug() returns the iterator to the first NON PHI, non-label instruction. We want the iterator to the first PHI node in a MBB. I don't see any function within that class that returns the iterator to the first PHI.
416	agreed
420	Arguments renamed and doc updated to be more clear.
467	You are right, this function alone can not be used to determine whether an instruction can move to a random block. This function is a helper function used by canMerge() to determine whether 2 candidates can be merged.
508	This seems to be the case for other classes like MachineOperand. If asserts are off it could end up doing the wrong thing....
511	okay
555	Renamed "From/To" to "SourceRegion/TargetRegion". I believe this is now less confusing ...
609	From.BranchBlock MUST equal To.BranchTargetBlock, not dominate.
621	agreed
627	assert moved to top of function
656	In analyzeBranch(), line 297, we do an early exit if the FallThroughBlock contains code.

lei marked 12 inline comments as done.Jan 22 2017, 10:29 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
621	changing this to be a range based splice call instead.

nemanjai added inline comments.Jan 22 2017, 12:21 PM

lib/CodeGen/BranchCoalescing.cpp
345	Oh OK. I missed the early exit check. I think the layout of this function is a little confusing so I can see why I missed it. It seems that the conditions under which we return true are (any of):
- The lists are pair-wise identical
- For any pairs that aren't identical, they're both registers that produce the same value

Why not such a simple layout to the function? For example, your check for same type is repeated in the call to `MachineOperand::isIdenticalTo()`. It seems to me like a layout such as this for the loop encapsulates what this function should do:

for (unsigned i = 0; i < OpList1.size(); ++i) {
  const MachineOperand &Op1 = OpList1[i];
  const MachineOperand &Op2 = OpList2[i];
  DEBUG(dbgs() << "Op1: " << Op1 << "\n"
               << "Op2: " << Op2 << "\n");
  if (Op1.isIdenticalTo(Op2)) {
    DEBUG(dbgs() << "Op1 and Op2 are identical!\n");
    return true;
  }
  if (Op1.isReg() && Op2.isReg() &&
      TargetRegisterInfo::isVirtualRegister(Op1.getReg()) &&
      TargetRegisterInfo::isVirtualRegister(Op2.getReg())) {
    if (TII->produceSameValue(Op1Def, Op2Def, MRI)) {
      DEBUG(dbgs() << "Operands produce the same value.\n");
      return true;
    }
  }
  DEBUG(dbgs() << "The operands are not provably identical.\n");
  return false;
}
357	I'm afraid this won't suffice. If it were to somehow happen that (at least) one is a physical register, you'll return true here - regardless of whether it's the same register (not that it really matters whether it's the same register or not).
379	Are you referring to some potential future use of this function? Because this result isn't currently used to determine whether the block has any PHI's. But I suppose you're right it is potentially dangerous for it to return a valid iterator when there are no PHI's. Also, the work this function does just seems like overkill. It seems like you should only be checking instructions between `MBB.instr_begin()` and `MBB.getFirstNonPHI()`. Why does that not suffice? In any case, considering this function is only used once and in many cases it'll return a value that you're just going to discard and use the `begin()` pointer, I don't think this function should exist. Just find the insert location in the function where you need it.
387	It seems below like you're inserting them before any existing PHI instructions.
394	OK, that makes sense. I suppose you want to make sure that you skip anything that comes between `begin()` and the first PHI node in the `ToMBB`. But I guess what I'm confused about is exactly what this function is meant to do. It appears that it will do the following:
- Iterate over all the PHI nodes in MBB called `FromMBB`
- If there are any PHI nodes that refer to the same MBB, they'll be updated so they refer to the MBB called `ToMBB`
- Then each of those PHI nodes will be moved to MBB called `ToMBB` before any existing PHI nodes in `ToMBB`

However, what happens to any PHI nodes in `ToMBB` that refer to `FromMBB`? They don't need to be updated?
467	Yes, but please add asserts to ensure this function isn't used incorrectly in the future. It is dangerous for a function called `canMoveTo` to return true when an instruction can't safely move to a basic block.
609	The two are not mutually exclusive. Also, this doesn't answer whether `From.BranchBlock` needs to dominate `To.FallThroughBlock`.
656	Sure. Now there is. I'd still prefer an assert. If someone in the future feels that `analyzeBranch()` doesn't need that early exit any longer, they'll need to ensure they account for this assert and why it's in this function.
730	Setting a bool to true and then using it in an if statement without any statements that could modify it in between is redundant. Besides, I think it's overly verbose to dump the entire function after every merge - I'd imagine that a dump after all the merging is done would suffice.

nemanjai added inline comments.Jan 23 2017, 9:54 AM

lib/CodeGen/BranchCoalescing.cpp
345	Sorry, the two `return true;` lines should be `continue;` so that you keep checking the remainder of the lists rather than returning true if the first pair are identical.

lei marked 12 inline comments as done.Jan 23 2017, 10:19 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp

345

Minor change so that we exit early when operands within the operand lists are detected to be different.

for (unsigned i = 0; i < OpList1.size(); ++i) {
   const MachineOperand &Op1 = OpList1[i];
   const MachineOperand &Op2 = OpList2[i];

   DEBUG(dbgs() << "Op1: " << Op1 << "\n"
                << "Op2: " << Op2 << "\n");

   if (Op1.isIdenticalTo(Op2)) {
     DEBUG(dbgs() << "Op1 and Op2 are identical!\n");
     continue;
   }

   // If the operands are not identical, but are registers, check to see if the
   // definition of the register produces the same value. If they produce the
   // same value, consider them to be identical.
   if (Op1.isReg() && Op2.isReg() &&
       TargetRegisterInfo::isVirtualRegister(Op1.getReg()) &&
       TargetRegisterInfo::isVirtualRegister(Op2.getReg())) {
     MachineInstr *Op1Def = MRI->getVRegDef(Op1.getReg());
     MachineInstr *Op2Def = MRI->getVRegDef(Op2.getReg());
     if (TII->produceSameValue(*Op1Def, *Op2Def, MRI)) {
       DEBUG(dbgs() << "Op1Def: " << *Op1Def << " and " << *Op2Def
                    << " produce the same value!\n");
     } else {
       DEBUG(dbgs() << "Operands produce different values\n");
       return false;
     }
   } else {
     DEBUG(dbgs() << "The operands are not provably identical.\n");
     return false;
   }
 }
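
For readers without an LLVM checkout, the control flow of the loop above can be reproduced in a standalone sketch. `ToyOperand` and the `SameValue` callback are hypothetical stand-ins for `MachineOperand` and the `produceSameValue` query; the point is only the `continue` versus early `return false` structure, and why returning true after the first matching pair would be wrong.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical operand: either an immediate or a virtual register number.
struct ToyOperand {
  bool IsVirtReg;
  int Value;
  bool isIdenticalTo(const ToyOperand &Other) const {
    return IsVirtReg == Other.IsVirtReg && Value == Other.Value;
  }
};

// Returns true only if every pair is identical or is a pair of virtual
// registers whose definitions are reported as producing the same value.
bool identicalOperands(const std::vector<ToyOperand> &OpList1,
                       const std::vector<ToyOperand> &OpList2,
                       const std::function<bool(int, int)> &SameValue) {
  if (OpList1.size() != OpList2.size())
    return false;
  for (std::size_t i = 0; i < OpList1.size(); ++i) {
    const ToyOperand &Op1 = OpList1[i];
    const ToyOperand &Op2 = OpList2[i];
    if (Op1.isIdenticalTo(Op2))
      continue; // keep checking the remaining pairs
    if (Op1.IsVirtReg && Op2.IsVirtReg && SameValue(Op1.Value, Op2.Value))
      continue;
    return false; // first pair that is not provably identical
  }
  return true;
}

// Usage: the first pair matches but the second does not, so the result must
// be false. An early "return true" on the first pair would get this wrong.
void demoIdenticalOperands() {
  std::vector<ToyOperand> A = {ToyOperand{false, 1024}, ToyOperand{true, 140}};
  std::vector<ToyOperand> B = {ToyOperand{false, 1024}, ToyOperand{false, 7}};
  assert(!identicalOperands(A, B, [](int, int) { return true; }));
}
```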

357

changes in previous comment will address this issue as well.

467

We always return false when an instruction isn't safe to move. This function just does a check to see if it's valid to move an instruction to the BB at the specified location. It won't work if it asserts when it can't be moved.

lei marked 6 inline comments as done.Jan 23 2017, 10:48 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
379	will remove function and add loop to location where it's being used.
387	Yes. They should be inserted before existing PHIs. Will update the doc.
394	This function moves PHI nodes in the FromMBB to the beginning of the PHI block in ToMBB. If any PHI nodes in ToMBB reference registers defined in PHI nodes in FromMBB, the PHI nodes in FromMBB can not be moved to ToMBB since we can not infer order in the PHI node execution.
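
The behaviour described in the reply on line 394 can be sketched on a toy representation. This is an assumption-laden model, not the patch's code: `ToyMI` and `ToyMBB` are hypothetical, each instruction defines exactly one register, and the rewriting of PHI operands that name the source block is omitted. The legality check runs before anything is modified, so a failed call leaves both blocks untouched.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins; PHIs are assumed to form the leading run of a block.
struct ToyMI {
  bool IsPHI;
  int Def;               // register defined by this instruction
  std::vector<int> Uses; // registers read by this instruction
};
using ToyMBB = std::vector<ToyMI>;

// Moves the PHIs of From to the front of To's PHI run. Returns false, without
// changing either block, if a PHI in To reads a register defined by a PHI in
// From (the ordering between PHIs cannot be relied on).
bool movePHIs(ToyMBB &From, ToyMBB &To) {
  assert(&From != &To && "source and target must be distinct blocks");
  auto IsNotPHI = [](const ToyMI &I) { return !I.IsPHI; };
  auto FromPHIEnd = std::find_if(From.begin(), From.end(), IsNotPHI);
  auto ToPHIEnd = std::find_if(To.begin(), To.end(), IsNotPHI);

  for (auto ToIt = To.begin(); ToIt != ToPHIEnd; ++ToIt)
    for (auto FromIt = From.begin(); FromIt != FromPHIEnd; ++FromIt)
      if (std::find(ToIt->Uses.begin(), ToIt->Uses.end(), FromIt->Def) !=
          ToIt->Uses.end())
        return false;

  // Range-based move, in the spirit of the range version of splice().
  To.insert(To.begin(), From.begin(), FromPHIEnd);
  From.erase(From.begin(), FromPHIEnd);
  return true;
}
```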

lei marked 8 inline comments as done.Jan 23 2017, 10:50 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
345	see code snippet above.

lei marked an inline comment as done.Jan 24 2017, 12:34 PM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
656	okay
730	For debug purposes, it would be nice to see the stages of merging... Will remove the if statement.

lei marked 3 inline comments as done.Jan 24 2017, 12:50 PM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
379	Actually according to the llvm lang ref, PHI instructions must be first in a basic block. Will remove this function and replace the function call below with MBB->begin()

lei added inline comments.Jan 24 2017, 12:50 PM

lib/CodeGen/BranchCoalescing.cpp
410	will update to use range version of splice()

lei marked an inline comment as done.Jan 24 2017, 9:01 PM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
609	The same kind of checks are needed in canMerge and mergeCandidate. I put the checks into a function validateCandidates() and call it to verify the MBBs, doing an early exit if it returns false.

lei marked an inline comment as done.Jan 25 2017, 8:46 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
443	will remove this redundant check since MI.defs() return register definitions.

Updated patch to address all the review comments from Nemanja.

Add a guard to verify only the MFs that this pass has modified.

Just a friendly reminder that this patch is waiting to be reviewed. Thanks!

Aside from a couple minor comments, this LGTM.

It would be good if @hfinkel or @echristo could take a look also.

lib/CodeGen/BranchCoalescing.cpp
120	Branch Coalescing not Branch Coalesce
754	Did we conclude whether this is different from the checks done by the verifyAfter parameter in addPass?

This revision is now accepted and ready to land.Feb 8 2017, 12:10 PM

I'll look today.

In general pretty happy with this. A few comments inline.

A few top level comments:

What performance testing have you done with this?
Looks like SPEC on ppc?
Anything else?
Same for correctness checking. You mentioned that this doesn't affect ARM and then change a thumb testcase, what's up there?

Thanks!

-eric

lib/CodeGen/BranchCoalescing.cpp
32	Let's have this default to off for the initial commit and then we can turn it on in a subsequent one.
231	Probably should be something like "canCoalesceBranch" instead of analyzeBranch. All of the analysis is being done by analyzeBranch really and this is actually figuring out whether or not we should coalesce.
253	Meta comment on the DEBUG statements. I like them, but it might be nice to give a summary of what you're analyzing here before all of the "can't do it" statements.
290–304	I believe the case you're looking for here is also whether or not there's a single fall through? Might be nice to reorganize the code with that in mind.
364	Could use an explanation of why - it confused Nemanja at first and could confuse others.
613	llvm_unreachable
643	Nit: All comments should be complete sentences. (A few other occurrences)
664–667	This set of comments is very confusing.

In D28249#672204, @echristo wrote:

In general pretty happy with this. A few comments inline.

A few top level comments:

What performance testing have you done with this?
Looks like SPEC on ppc?
Anything else?
Same for correctness checking. You mentioned that this doesn't affect ARM and then change a thumb testcase, what's up there?

Thanks!

-eric

Yes, the only performance testing done was SPEC on PPC.

Seems I was testing with the wrong triple for ARM. The -mtriple I was testing with was "-mtriple=armv6-unknown-linux-gnu", which generated a pattern that we do not recognize. The test case below uses -mtriple=thumb-apple-darwin and -mtriple=thumb-pc-linux-gnueabi. The select statements for those triples do generate the pattern we are looking for.

I have updated the patch so that the pass is turned off by default.

lei added inline comments.Feb 13 2017, 1:03 PM

lib/CodeGen/BranchCoalescing.cpp
32	okay
120	okay
231	true
253	okay

lei marked 5 inline comments as done.Feb 20 2017, 8:33 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
364	Since PHI node ordering can not be assumed, it doesn't really matter where we place the PHI instructions. Will update comment to reflect this.

lei marked 5 inline comments as done.Feb 20 2017, 10:03 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
664–667	okay

Because select gets converted to branches in PowerPC, I was wondering if there is a way to not generate selects in the first place.

lei marked 5 inline comments as done.Feb 21 2017, 10:11 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
613	okay

lei marked 3 inline comments as done.Feb 21 2017, 10:43 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
643	k.
754	Yes. This is different from the checks done by the verifyAfter parameter in addPass. The verifyAfter parameter for this pass is set to true by default.

lei marked 4 inline comments as done.Feb 21 2017, 11:00 AM

In D28249#681685, @hiraditya wrote:

Because select gets converted to branches in PowerPC, I was wondering if there is a way to not generate selects in the first place.

It is not true in general that selects get converted to branches on PowerPC. For example, integer selects can be lowered into an actual "integer select" instruction under the "right" circumstances.
I'm sure there is a way to prevent the selects from being produced, but I am not sure this would have any effect on this optimization. For the specific test case that is added as part of this patch, it appears that SROA produces the selects. I haven't personally looked at the semantics of SROA and the heuristics it uses to produce a select or not, but there is presumably a way to tell it not to do so.

In D28249#682567, @nemanjai wrote:

In D28249#681685, @hiraditya wrote:

Because select gets converted to branches in PowerPC, I was wondering if there is a way to not generate selects in the first place.

It is not true in general that selects get converted to branches on PowerPC. For example, integer selects can be lowered into an actual "integer select" instruction under the "right" circumstances.
I'm sure there is a way to prevent the selects from being produced, but I am not sure this would have any effect on this optimization. For the specific test case that is added as part of this patch, it appears that SROA produces the selects. I haven't personally looked at the semantics of SROA and the heuristics it uses to produce a select or not, but there is presumably a way to tell it not to do so.

In general, we want to produce selects rather than explicit branching due to superblock formation etc. On Power9 we'll find that the integer select instruction is also much faster than on Power8 and so will experience less of a penalty as well.

You marked things done that aren't done. What's going on? :)

-eric

lib/CodeGen/BranchCoalescing.cpp
231	You've marked this done, but it's not done?

lei marked an inline comment as done.Feb 22 2017, 11:49 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
290–304	okay.

In D28249#682608, @echristo wrote:

You marked things done that aren't done. What's going on? :)

-eric

Someone told me no one looked at that so I was using it to mark what I have done in my version ... which is not uploaded yet :) Will stop doing that now that I know it is actually being noted on.

lei marked an inline comment as not done.Feb 22 2017, 11:53 AM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
290–304	Will reorganize this.

In D28249#683815, @lei wrote:

In D28249#682608, @echristo wrote:

You marked things done that aren't done. What's going on? :)

-eric

Someone told me no one looked at that so I was using it to mark what I have done in my version ... which is not uploaded yet :) Will stop doing that now that I know it is actually being noted on.

Thanks :)

I very much look at it to make sure everything I asked for was done before coming back to reviews.

Address review comments.

In D28249#683829, @echristo wrote:

In D28249#683815, @lei wrote:

In D28249#682608, @echristo wrote:

You marked things done that aren't done. What's going on? :)

-eric

Someone told me no one looked at that so I was using it to mark what I have done in my version ... which is not uploaded yet :) Will stop doing that now that I know it is actually being noted on.

Thanks :)

I very much look at it to make sure everything I asked for was done before coming back to reviews.

Now it's really done :)

Bunch of inline comments.

Thanks!

-eric

include/llvm/CodeGen/Passes.h
403	Nit: Complete sentences in comments please.
lib/CodeGen/BranchCoalescing.cpp
353	"beginning"
368	"contains"
379	I don't think this has ever been changed, but I'm not against a separate function. That said, doesn't transferSuccessorsAndUpdatePHIs already do this? Also, this loop needs documentation :)
393–394	This seems like an odd comment.
394	Should probably assert this somewhere.
405	I think it makes just as much sense (and is more readable) if you split the function into canMoveToBeginning and canMoveToEnd and just drop the boolean parameter. It'll also make it easier to split the docs.
454–467	Possible you just want to replace this with a bunch of llvm_unreachables?
480	Why?
488	Sounds like an assert.
517	Seems like you want to swap these conditionals?
535	else if? Otherwise what happens?
758	This was to be turned on in a subsequent patch. Remove this and the ppc support please :)

lei added inline comments.Feb 24 2017, 8:35 AM

include/llvm/CodeGen/Passes.h
403	okay
lib/CodeGen/BranchCoalescing.cpp
379	It's because here we only want to move the PHIs from SourceMBB to TargetMBB. SourceMBB will be deleted later. We don't want to transfer the successor info here. Control flow needs to be updated once we have deleted SourceMBB.
393–394	An MI can only be moved to TargetMBB if there are no uses of it within TargetMBB's PHI nodes. Will reword the comment.
394	We don't assert because we first check one direction (canMoveToBeginning) and then check the other direction (canMoveToEnd). If both fail, we just don't move the instruction and move on to investigate the next one.
405	agreed.
454–467	yup
480	Maybe preference is the wrong word here... this is by design. Will update the doc
488	we do assert on that, will update the doc here.
535	No else if. We check conditions to move up and down separately and verify that only one direction is valid.

Address review comments.

lei marked 12 inline comments as done.Feb 24 2017, 10:54 PM

lei added inline comments.

lib/CodeGen/BranchCoalescing.cpp
758	see line 161

lei marked an inline comment as done.Feb 24 2017, 10:56 PM

LGTM.

Thanks! Now let's get some performance testing and go from there.

-eric

Closed by commit rL296670: Improve scheduling with branch coalescing (authored by nemanjai). Mar 1 2017, 12:41 PM

This revision was automatically updated to reflect the committed changes.

Just noticed we have a new pass in the codegen pipeline and wondered what it is about. Some comments:

The description/examples talk about the same branch condition in the IR but the IR doesn't even have branches and this is an MI pass, not an IR pass.
When I codegen the given example on X86 I do indeed see silly code getting generated because isel chooses a "CMOV_FR64" for each of the select instructions which "Expand ISel Pseudo-instructions" later expands into 3 if-diamonds that all have the same condition.
It looks like we created a whole new pass to fix Expand ISel Pseudo being stupid?
This is a generic codegen pass but the description here indicates it only ever happens to match the patterns on PowerPC?

Hi Matthias,

I'll let Lei talk to more of it, however...

In D28249#723145, @MatzeB wrote:

Just noticed we have a new pass in the codegen pipeline and wondered what it is about. Some comments:

The description/examples talk about the same branch condition in the IR but the IR doesn't even have branches and this is an MI pass, not an IR pass.

Would you prefer s/IR/MIR? :)

That said, it's definitely meant to be an MI pass.

When I codegen the given example on X86 I do indeed see silly code getting generated because isel chooses a "CMOV_FR64" for each of the select instructions which "Expand ISel Pseudo-instructions" later expands into 3 if-diamonds that all have the same condition.

Yep. The arm and mips backends also do the same thing.

It looks like we created a whole new pass to fix Expand ISel Pseudo being stupid?

I agree up to a point, though I don't have any particular ideas here how to fix it. My thought was "let the backends expand into the easiest code possible and then clean it up with a generic pass", but I'm definitely open to different ideas.

This is a generic codegen pass but the description here indicates it only ever happens to match the patterns on PowerPC?

It's not so much that as the original patch was on by default and required changes to the ARM testcases, but the motivating example was particularly poor code on Power. We could probably throw some extra tests in, but it seems like a stretch to make all of the examples quite so universal? The code itself, of course, has no backend dependencies or even TTI.

Thoughts?

-eric

To say this first: I'm fine with the current implementation.

In D28249#739046, @echristo wrote:

Hi Matthias,

I'll let Lei talk to more of it, however...

In D28249#723145, @MatzeB wrote:

Just noticed we have a new pass in the codegen pipeline and wondered what it is about. Some comments:

The description/examples talk about the same branch condition in the IR but the IR doesn't even have branches and this is an MI pass, not an IR pass.

Would you prefer s/IR/MIR? :)

This comment was mostly about the description of this patch which was confusing as it only talked about IR. Actually looking at the code the documentation comment above the class is a lot better and explains the problem just fine.

That said, it's definitely meant to be an MI pass.

When I codegen the given example on X86 I do indeed see silly code getting generated because isel chooses a "CMOV_FR64" for each of the select instructions which "Expand ISel Pseudo-instructions" later expands into 3 if-diamonds that all have the same condition.

Yep. The arm and mips backends also do the same thing.

It looks like we created a whole new pass to fix Expand ISel Pseudo being stupid?

I agree up to a point, though I don't have any particular ideas here how to fix it. My thought was "let the backends expand into the easiest code possible and then clean it up with a generic pass", but I'm definitely open to different ideas.

Again no hard rule. My intuition would be that we would get by with less complexity and less code by improving expandisel pseudos; I would expect that matching the pattern and checking for correctness should be quite a bit easier there. But this isn't a big issue and now that Lei already went through the trouble of creating a generic pass we may just as well keep it.

This is a generic codegen pass but the description here indicates it only ever happens to match the patterns on PowerPC?

It's not so much that as the original patch was on by default and required changes to the ARM testcases, but the motivating example was particularly poor code on Power. We could probably throw some extra tests in, but it seems like a stretch to make all of the examples quite so universal? The code itself, of course, has no backend dependencies or even TTI.

Please ignore this comment. I was mainly triggered by "After Branch Coalescing" showing up in -print-after-all dumps while the pass never changed anything, and by the comments above about this only working on PowerPC. It turns out the pass just checks whether enable-branch-coalesce was passed as a command-line flag. I think most passes have a cl::opt in TargetPassConfig.cpp instead, and we don't even add the pass to the pipeline if it is disabled. You could change to that style too, to avoid people seeing the dump for a disabled pass :)

Good to hear this is a generic pass. I am pretty sure the code is equally bad on the other targets, and it is good that we fixed it. I'll add a ticket into our bugtracker to experiment/evaluate on X86/AArch64.

Matthias

I think most passes have a cl::opt in TargetPassConfig.cpp instead, and we don't even add the pass to the pipeline if it is disabled. You could change to that style too, to avoid people seeing the dump for a disabled pass :)

Will address this in the next patch to turn this on by default for PowerPC.
Thanks

iteratee added a subscriber: iteratee.Aug 30 2017, 3:47 PM

iteratee added inline comments.

llvm/trunk/lib/CodeGen/BranchCoalescing.cpp

340 ↗

(On Diff #90219)

This line isn't correct. I have a case where instructions that "produce the same value" are different. The relevant sequence is:

BB#16: 
...
        BCTRL8_LDinto_toc
...
        %vreg140<def> = COPY %CR0GT; CRBITRC:%vreg140
        %vreg141<def> = LXSDX %vreg138, %vreg129, %RM<imp-use>; mem:LD8[%134](dereferenceable) F8RC:%vreg141 G8RC_and_G8RC_NOX0:%vreg138 G8RC:%vreg129
        %vreg142<def> = XXLXORdpz; F8RC:%vreg142
        BC %vreg140, <BB#73>; CRBITRC:%vreg140

BB#72: derived from LLVM BB %114
    Predecessors according to CFG: BB#16
    Successors according to CFG: BB#73(?%)

BB#73: derived from LLVM BB %114
    Predecessors according to CFG: BB#16 BB#72
        %vreg143<def> = PHI %vreg142, <BB#72>, %vreg141, <BB#16>; F8RC:%vreg143,%vreg142,%vreg141
...
        BCTRL8_LDinto_toc
...
        %vreg149<def> = COPY %CR0GT; CRBITRC:%vreg149
        %vreg150<def> = LXSDX %vreg138, %vreg129, %RM<imp-use>; mem:LD8[%134](dereferenceable) F8RC:%vreg150 G8RC_and_G8RC_NOX0:%vreg138 G8RC:%vreg129
        BC %vreg149, <BB#75>; CRBITRC:%vreg149
    Successors according to CFG: BB#74(?%) BB#75(?%)

BB#74: derived from LLVM BB %114
    Predecessors according to CFG: BB#73
    Successors according to CFG: BB#75(?%)

BB#75: derived from LLVM BB %114
    Predecessors according to CFG: BB#73 BB#74
        %vreg151<def> = PHI %vreg142, <BB#74>, %vreg150, <BB#73>; F8RC:%vreg151,%vreg142,%vreg150

The debug output produces:

Valid Candidate
Op1: 1024
Op2: 1024
Op1 and Op2 are identical!
Op1: %vreg140
Op2: %vreg149
Op1Def: %vreg140<def> = COPY %CR0GT; CRBITRC:%vreg140
 and %vreg149<def> = COPY %CR0GT; CRBITRC:%vreg149
 produce the same value!

While it would be safe to CSE those crmoves, what definitely cannot occur is to assume that the value of CR0GT has not changed between the 2 instructions.
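
The hazard can be reproduced outside LLVM with a toy interpreter. Everything in this sketch (the `Inst` struct, `run`, `textuallyIdentical`, and register names as plain strings) is hypothetical and only models the shape of the problem: two textually identical copies from a physical register that is redefined between them. The constants 1 and 0 are made-up stand-ins for whatever the two calls leave in CR0GT.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy instruction (hypothetical, not LLVM's MachineInstr): either writes the
// constant Imm to Dst (Src empty) or copies the current value of Src to Dst.
struct Inst {
  std::string Dst;
  std::string Src;
  int Imm;
};

// Execute the instructions in order and return the final register state.
std::map<std::string, int> run(const std::vector<Inst> &Prog) {
  std::map<std::string, int> Regs;
  for (const Inst &I : Prog) {
    int Value = I.Src.empty() ? I.Imm : Regs[I.Src];
    Regs[I.Dst] = Value;
  }
  return Regs;
}

// Two copies look identical when they read the same source register.
bool textuallyIdentical(const Inst &A, const Inst &B) {
  return !A.Src.empty() && A.Src == B.Src;
}

// Models the reported sequence: each call redefines CR0GT before the COPY.
std::vector<Inst> reportedSequence() {
  return {
      {"CR0GT", "", 1},        // first call sets the physical register
      {"vreg140", "CR0GT", 0}, // vreg140 = COPY CR0GT
      {"CR0GT", "", 0},        // second call redefines it
      {"vreg149", "CR0GT", 0}, // vreg149 = COPY CR0GT
  };
}

// Usage: the copies are textually identical but hold different values.
void demoHazard() {
  std::vector<Inst> Prog = reportedSequence();
  std::map<std::string, int> Regs = run(Prog);
  assert(textuallyIdentical(Prog[1], Prog[3]));
  assert(Regs.at("vreg140") != Regs.at("vreg149"));
}
```

In this model, comparing only the defining instructions says "same value", while executing them shows otherwise, which is the distinction the comment above draws.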

hfinkel added inline comments.Aug 30 2017, 4:33 PM

llvm/trunk/lib/CodeGen/BranchCoalescing.cpp

340 ↗

(On Diff #90219)

Indeed; I think that we need to filter out instructions with physical-register uses (even if identical). We might be able to be a little smarter about it, in a similar way to MachineLICM (the other user of this function), which does this:

// Don't hoist an instruction that uses or defines a physical register.
if (TargetRegisterInfo::isPhysicalRegister(Reg)) {
  if (MO.isUse()) {
    // If the physreg has no defs anywhere, it's just an ambient register
    // and we can freely move its uses. Alternatively, if it's allocatable,
    // it could get allocated to something with a def during allocation.
    // However, if the physreg is known to always be caller saved/restored
    // then this use is safe to hoist.
    if (!MRI->isConstantPhysReg(Reg) &&
        !(TRI->isCallerPreservedPhysReg(Reg, *I.getParent()->getParent())))
      return false;
    // Otherwise it's safe to move.
    continue;
  } else if (!MO.isDead()) {
    // A def that isn't dead. We can't move it.
    return false;
  ...

llvm/trunk/lib/CodeGen/BranchCoalescing.cpp
340 ↗	(On Diff #90219)	Addressing this in https://reviews.llvm.org/D32776
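
One conservative reading of the filter suggested above is to refuse to equate two defining instructions whenever either of them reads a physical register. The sketch below models that on hypothetical stand-in types (`Operand`, `Def`, `sameValueConservative`); it does not use the LLVM API, it omits the constant and caller-preserved refinements from the MachineLICM excerpt, and it makes no claim about what D32776 actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for a register use and a defining instruction.
struct Operand {
  std::string Reg;
  bool IsPhysical;
};
struct Def {
  std::string Opcode;
  std::vector<Operand> Uses;
};

// True if any use operand of D is a physical register.
bool readsPhysicalRegister(const Def &D) {
  for (const Operand &O : D.Uses)
    if (O.IsPhysical)
      return true;
  return false;
}

// Purely structural comparison: same opcode and same registers, in order.
bool sameOpcodeAndUses(const Def &A, const Def &B) {
  if (A.Opcode != B.Opcode || A.Uses.size() != B.Uses.size())
    return false;
  for (std::size_t I = 0; I < A.Uses.size(); ++I)
    if (A.Uses[I].Reg != B.Uses[I].Reg)
      return false;
  return true;
}

// Conservative check: a physical register may be redefined between the two
// instructions, so structural equality is not trusted when one is read.
bool sameValueConservative(const Def &A, const Def &B) {
  if (readsPhysicalRegister(A) || readsPhysicalRegister(B))
    return false;
  return sameOpcodeAndUses(A, B);
}

// Usage: the two COPYs from CR0GT match structurally but are rejected.
void demoFilter() {
  Def C1{"COPY", {{"CR0GT", true}}};
  Def C2{"COPY", {{"CR0GT", true}}};
  assert(sameOpcodeAndUses(C1, C2));
  assert(!sameValueConservative(C1, C2));
}
```

The trade-off is lost opportunities: uses of registers that are never redefined would also be rejected, which is the case the MachineLICM excerpt handles more precisely.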

Revision Contents

Path	Size
include/llvm/CodeGen/Passes.h	3 lines
include/llvm/InitializePasses.h	1 line
lib/CodeGen/	760 lines, 1 line, 1 line, 4 lines
test/CodeGen/PowerPC/branch_coalesce.ll	31 lines
test/CodeGen/PowerPC/select-i1-vs-i1.ll	9 lines
test/CodeGen/Thumb/select.ll	4 lines

Diff 85784

include/llvm/CodeGen/Passes.h

  /// MachineDominanaceFrontier - This pass is a machine dominators analysis pass.
  /// if available with PysicalRegisterUsageInfo pass.
  FunctionPass *createRegUsageInfoPropPass();

  /// This pass performs software pipelining on machine instructions.
  extern char &MachinePipelinerID;

  /// This pass frees the memory occupied by the MachineFunction.
  FunctionPass *createFreeMachineFunctionPass();

+ /// Branch Coalescing - combine basic blocks guarded by the same branch
echristo (Done): Nit: Complete sentences in comments please.
lei (Not Done): okay
+ extern char &BranchCoalescingID;
  } // End llvm namespace

  /// Target machine pass initializer for passes with dependencies. Use with
  /// INITIALIZE_TM_PASS_END.
  #define INITIALIZE_TM_PASS_BEGIN INITIALIZE_PASS_BEGIN

  /// Target machine pass initializer for passes with dependencies. Use with
  /// INITIALIZE_TM_PASS_BEGIN.

include/llvm/InitializePasses.h

  void initializeAtomicExpandPass(PassRegistry&);
  void initializeBBVectorizePass(PassRegistry&);
  void initializeBDCELegacyPassPass(PassRegistry &);
  void initializeBarrierNoopPass(PassRegistry&);
  void initializeBasicAAWrapperPassPass(PassRegistry&);
  void initializeBlockExtractorPassPass(PassRegistry&);
  void initializeBlockFrequencyInfoWrapperPassPass(PassRegistry&);
  void initializeBoundsCheckingPass(PassRegistry&);
+ void initializeBranchCoalescingPass(PassRegistry&);
  void initializeBranchFolderPassPass(PassRegistry&);
  void initializeBranchProbabilityInfoWrapperPassPass(PassRegistry&);
  void initializeBranchRelaxationPass(PassRegistry&);
  void initializeBreakCriticalEdgesPass(PassRegistry&);
  void initializeCFGOnlyViewerLegacyPassPass(PassRegistry&);
  void initializeCFGPrinterLegacyPassPass(PassRegistry&);
  void initializeCFGOnlyPrinterLegacyPassPass(PassRegistry&);
  void initializeCFGSimplifyPassPass(PassRegistry&);

lib/CodeGen/BranchCoalescing.cpp

This file was added.

				//===-- CoalesceBranches.cpp - Coalesce blocks with the same condition ---===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// \file
				/// Coalesce basic blocks guarded by the same branch condition into a single
				/// basic block.
				///
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/BitVector.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/CodeGen/MachineDominators.h"
				#include "llvm/CodeGen/MachineFunctionPass.h"
				#include "llvm/CodeGen/MachinePostDominators.h"
				#include "llvm/CodeGen/MachineRegisterInfo.h"
				#include "llvm/CodeGen/Passes.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Target/TargetInstrInfo.h"
				#include "llvm/Target/TargetSubtargetInfo.h"

				using namespace llvm;

				#define DEBUG_TYPE "coal-branch"

				static cl::opt<cl::boolOrDefault>
				EnableBranchCoalescing("enable-branch-coalesce", cl::Hidden,
echristo (Done): Let's have this default to off for the initial commit and then we can turn it on in a subsequent one.
lei (Done): okay
				cl::desc("enable coalescing of duplicate branches"));

				STATISTIC(NumBlocksCoalesced, "Number of blocks coalesced");
				STATISTIC(NumPHINotMoved, "Number of PHI Nodes that cannot be merged");
				STATISTIC(NumBlocksNotCoalesced, "Number of blocks not coalesced");

				//===----------------------------------------------------------------------===//
				// BranchCoalescing
				//===----------------------------------------------------------------------===//
				///
				/// Improve scheduling by coalescing branches that depend on the same condition.
				/// This pass looks for blocks that are guarded by the same branch condition
				/// and attempts to merge the blocks together. Such opportunities arise from
				/// the expansion of select statements in the IR.
				///
				/// For example, consider the following LLVM IR:
				///
				/// %test = icmp eq i32 %x 0
				/// %tmp1 = select i1 %test, double %a, double 2.000000e-03
				/// %tmp2 = select i1 %test, double %b, double 5.000000e-03
				///
				/// This IR expands to the following machine code on PowerPC:
				///
				/// BB#0: derived from LLVM BB %entry
				/// Live Ins: %F1 %F3 %X6
				/// <SNIP1>
				/// %vreg0<def> = COPY %F1; F8RC:%vreg0
				/// %vreg5<def> = CMPLWI %vreg4<kill>, 0; CRRC:%vreg5 GPRC:%vreg4
				/// %vreg8<def> = LXSDX %ZERO8, %vreg7<kill>, %RM<imp-use>;
				/// mem:LD8[ConstantPool] F8RC:%vreg8 G8RC:%vreg7
				/// BCC 76, %vreg5, <BB#2>; CRRC:%vreg5
				/// Successors according to CFG: BB#1(?%) BB#2(?%)
				///
				/// BB#1: derived from LLVM BB %entry
				/// Predecessors according to CFG: BB#0
				/// Successors according to CFG: BB#2(?%)
				///
				/// BB#2: derived from LLVM BB %entry
				/// Predecessors according to CFG: BB#0 BB#1
				/// %vreg9<def> = PHI %vreg8, <BB#1>, %vreg0, <BB#0>;
				/// F8RC:%vreg9,%vreg8,%vreg0
				/// <SNIP2>
				/// BCC 76, %vreg5, <BB#4>; CRRC:%vreg5
				/// Successors according to CFG: BB#3(?%) BB#4(?%)
				///
				/// BB#3: derived from LLVM BB %entry
				/// Predecessors according to CFG: BB#2
				/// Successors according to CFG: BB#4(?%)
				///
				/// BB#4: derived from LLVM BB %entry
				/// Predecessors according to CFG: BB#2 BB#3
				/// %vreg13<def> = PHI %vreg12, <BB#3>, %vreg2, <BB#2>;
				/// F8RC:%vreg13,%vreg12,%vreg2
				/// <SNIP3>
				/// BLR8 %LR8<imp-use>, %RM<imp-use>, %F1<imp-use>
				///
				/// When this pattern is detected, branch coalescing will try to collapse
				/// it by moving code in BB#2 to BB#0 and/or BB#4 and removing BB#3.
				///
				/// If all conditions are meet, IR should collapse to:
				///
				/// BB#0: derived from LLVM BB %entry
				/// Live Ins: %F1 %F3 %X6
				/// <SNIP1>
				/// %vreg0<def> = COPY %F1; F8RC:%vreg0
				/// %vreg5<def> = CMPLWI %vreg4<kill>, 0; CRRC:%vreg5 GPRC:%vreg4
				/// %vreg8<def> = LXSDX %ZERO8, %vreg7<kill>, %RM<imp-use>;
				/// mem:LD8[ConstantPool] F8RC:%vreg8 G8RC:%vreg7
				/// <SNIP2>
				/// BCC 76, %vreg5, <BB#4>; CRRC:%vreg5
				/// Successors according to CFG: BB#1(0x2aaaaaaa / 0x80000000 = 33.33%)
				/// BB#4(0x55555554 / 0x80000000 = 66.67%)
				///
				/// BB#1: derived from LLVM BB %entry
				/// Predecessors according to CFG: BB#0
				/// Successors according to CFG: BB#4(0x40000000 / 0x80000000 = 50.00%)
				///
				/// BB#4: derived from LLVM BB %entry
				/// Predecessors according to CFG: BB#0 BB#1
				/// %vreg9<def> = PHI %vreg8, <BB#1>, %vreg0, <BB#0>;
				/// F8RC:%vreg9,%vreg8,%vreg0
				/// %vreg13<def> = PHI %vreg12, <BB#1>, %vreg2, <BB#0>;
				/// F8RC:%vreg13,%vreg12,%vreg2
				/// <SNIP3>
				/// BLR8 %LR8<imp-use>, %RM<imp-use>, %F1<imp-use>
				///
				/// Branch Coalesce does not split blocks, it moves everything in the same
				/// direction ensuring it does not break use/definition semantics.
kbarton (Done): Branch Coalescing not Branch Coalesce
lei (Done): okay
				///
				/// PHI nodes and its corresponding use instructions are moved to its successor
				/// block if there are no uses within the successor block PHI nodes. PHI
				/// node ordering cannot be assumed.
				///
				/// Non-PHI can be moved up to the predecessor basic block or down to the
				/// successor basic block following any PHI instructions. Whether it moves
				/// up or down depends on whether the register(s) defined in the instructions
				/// are used in current block or in any PHI instructions at the beginning of
				/// the successor block.

				namespace {

				class BranchCoalescing : public MachineFunctionPass {
				struct CoalescingCandidateInfo {
				MachineBasicBlock *BranchBlock; //< Block containing the branch
				MachineBasicBlock *BranchTargetBlock; //< Block branched to
				MachineBasicBlock *FallThroughBlock; //< Fall-through if branch not taken
				SmallVector<MachineOperand, 4> Cond;
				bool MustMoveDown;
				bool MustMoveUp;

				CoalescingCandidateInfo();
				void clear();
				};

				MachineDominatorTree *MDT;
				MachinePostDominatorTree *MPDT;
				const TargetInstrInfo *TII;
				MachineRegisterInfo *MRI;

				void initialize(MachineFunction &F);
				bool analyzeBranch(CoalescingCandidateInfo &Cand);
				bool identicalOperands(ArrayRef<MachineOperand> OperandList1,
				ArrayRef<MachineOperand> OperandList2) const;
				bool validateCandidates(CoalescingCandidateInfo &SourceRegion,
				CoalescingCandidateInfo &TargetRegion) const;

				static bool isBranchCoalescingEnabled() {
				return EnableBranchCoalescing != cl::BOU_FALSE;
				}

				public:
				static char ID;

				BranchCoalescing() : MachineFunctionPass(ID) {
				initializeBranchCoalescingPass(*PassRegistry::getPassRegistry());
				}

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.addRequired<MachineDominatorTree>();
				AU.addRequired<MachinePostDominatorTree>();
				MachineFunctionPass::getAnalysisUsage(AU);
				}

				StringRef getPassName() const override { return "Branch Coalescing"; }

				bool mergeCandidates(CoalescingCandidateInfo &SourceRegion,
				CoalescingCandidateInfo &TargetRegion);
				bool canMoveTo(const MachineInstr &MI, const MachineBasicBlock &MBB,
				bool MoveToBeginning) const;
				bool canMerge(CoalescingCandidateInfo &SourceRegion,
				CoalescingCandidateInfo &TargetRegion) const;
				void moveAndUpdatePHIs(MachineBasicBlock *SourceRegionMBB,
				MachineBasicBlock *TargetRegionMBB);
				bool runOnMachineFunction(MachineFunction &MF) override;
				};
				} // End anonymous namespace.

				char BranchCoalescing::ID = 0;
				char &llvm::BranchCoalescingID = BranchCoalescing::ID;

				INITIALIZE_PASS_BEGIN(BranchCoalescing, "branch-coalescing",
				"Branch Coalescing", false, false)
				INITIALIZE_PASS_DEPENDENCY(MachineDominatorTree)
				INITIALIZE_PASS_DEPENDENCY(MachinePostDominatorTree)
				INITIALIZE_PASS_END(BranchCoalescing, "branch-coalescing", "Branch Coalescing",
				false, false)

				BranchCoalescing::CoalescingCandidateInfo::CoalescingCandidateInfo()
				: BranchBlock(nullptr), BranchTargetBlock(nullptr),
				FallThroughBlock(nullptr), MustMoveDown(false), MustMoveUp(false) {}

				void BranchCoalescing::CoalescingCandidateInfo::clear() {
				BranchBlock = nullptr;
				BranchTargetBlock = nullptr;
				FallThroughBlock = nullptr;
				Cond.clear();
				MustMoveDown = false;
				MustMoveUp = false;
				}

				void BranchCoalescing::initialize(MachineFunction &MF) {
				MDT = &getAnalysis<MachineDominatorTree>();
				MPDT = &getAnalysis<MachinePostDominatorTree>();
				TII = MF.getSubtarget().getInstrInfo();
				MRI = &MF.getRegInfo();
				}
nemanjai (Done): We want to reset the stats for every function? I would have assumed we're interested in collecting this for the entire module.
lei (Done): Initialization removed.

				///
				/// Analyze the branch statement to determine if it can be coalesced. This
				/// method analyses the branch statement for the given candidate to determine
				/// if it can be coalesced. If the branch can be coalesced, then the
				/// BranchTargetBlock and the FallThroughBlock are recorded in the specified
				/// Candidate.
				///
				///\param[in,out] Cand The coalescing candidate to analyze
				///\return true if and only if the branch can be coalesced, false otherwise
nemanjai (Done): I am personally not a fan of functions that modify an output parameter and then fail. Presumably this makes no difference now, but if we end up extending this pass in the future to have some backtracking/requeuing capabilities, it would be nice if the object wasn't modified/invalidated in some way.
lei (Done): agreed.
				///
				bool BranchCoalescing::analyzeBranch(CoalescingCandidateInfo &Cand) {
				DEBUG(dbgs() << "Analyzing branch for block " << Cand.BranchBlock->getNumber()
echristo (Done): Probably should be something like "canCoalesceBranch" instead of analyzeBranch. All of the analysis is being done by analyzeBranch really and this is actually figuring out whether or not we should coalesce.
lei (Done): true
echristo (Done): You've marked this done, but it's not done?
				<< ": ");
				MachineBasicBlock *FalseMBB = nullptr;

				if (TII->analyzeBranch(*Cand.BranchBlock, Cand.BranchTargetBlock, FalseMBB,
				Cand.Cond)) {
				DEBUG(dbgs() << "TII unable to Analyze Branch - skip\n");
				return false;
				}

				for (auto &I : Cand.BranchBlock->terminators()) {
				DEBUG(dbgs() << "Looking at terminator : " << I << "\n");
				if (!I.isBranch())
				continue;

				if (I.getNumOperands() != I.getNumExplicitOperands()) {
				DEBUG(dbgs() << "Terminator contains implicit operands - skip : " << I
				<< "\n");
				return false;
				}
				}

				// For now only consider triangles (i.e, BranchTargetBlock is set,
echristo (Done): Meta comment on the DEBUG statements. I like them, but it might be nice to give a summary of what you're analyzing here before all of the "can't do it" statements.
lei (Done): okay
				// FalseMBBim is null)
				if (!Cand.BranchTargetBlock || (Cand.BranchTargetBlock && FalseMBB)) {
				DEBUG(dbgs() << "Does not form a triangle - skip\n");
				return false;
				}

				if (Cand.BranchTargetBlock == Cand.BranchBlock) {
				DEBUG(dbgs() << "Branch to the same block - skip\n");
				return false;
				}

				// Only consider simple control flow for now. In other words, only try to
				// coalesce the branch-taken block (i.e., BranchTargetBlock) if it
				// post-dominates the current block
				if (!MPDT->dominates(Cand.BranchTargetBlock, Cand.BranchBlock)) {
				DEBUG(dbgs() << "Complex control flow - skip\n");
				return false;
				}

				if (Cand.BranchBlock->isEHPad() || Cand.BranchBlock->hasEHPadSuccessor()) {
				DEBUG(dbgs() << "EH Pad - skip\n");
				return false;
				}

				// Ensure there are only two successors
				if (Cand.BranchBlock->succ_size() != 2) {
				DEBUG(dbgs() << "Does not have 2 successors - skip\n");
				return false;
				}

				// Sanity check - the block must be able to fall through
				assert(Cand.BranchBlock->canFallThrough() &&
				"Expecting the block to fall through!");

				// Record the fall through block
				for (MachineBasicBlock *Succ : Cand.BranchBlock->successors())
				if (Succ != Cand.BranchTargetBlock) {
				assert(Succ && "Expecting a valid fall-through block\n");

				if (!Succ->empty()) {
				DEBUG(dbgs() << "Fall-through block contains code -- skip\n");
				return false;
				}

				if (!Succ->isSuccessor(Cand.BranchTargetBlock)) {
				DEBUG(dbgs()
				<< "Successor of fall through block is not branch taken block\n");
				return false;
				}
				Cand.FallThroughBlock = Succ;
				}
echristo (Done): I believe the case you're looking for here is also whether or not there's a single fall through? Might be nice to reorganize the code with that in mind.
lei (Not Done): okay.
lei (Not Done): Will reorganize this.

				DEBUG(dbgs() << "Valid Candidate\n");
				return true;
				}
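
The shape test in this function can be restated on a toy CFG. The sketch below is a simplified stand-in, not the pass's code: `Block` and `findTriangleFallThrough` are hypothetical names, blocks are vector indices, and the post-dominance and EH-pad checks are left out. It is also slightly stricter than the diff in one respect, since it requires the fall-through block to have the branch target as its only successor.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine basic block.
struct Block {
  std::vector<int> Succs; // indices of successor blocks
  bool Empty;             // true when the block holds no instructions
};

// Returns the index of the fall-through block if Branch, its fall-through,
// and Target form a triangle; returns -1 otherwise.
int findTriangleFallThrough(const std::vector<Block> &CFG, int Branch,
                            int Target) {
  assert(Branch >= 0 && Branch < static_cast<int>(CFG.size()));
  const Block &B = CFG[Branch];

  // A branch back to the same block is not a candidate.
  if (Target == Branch)
    return -1;
  // Exactly two successors: the branch target and the fall-through.
  if (B.Succs.size() != 2)
    return -1;
  if (std::find(B.Succs.begin(), B.Succs.end(), Target) == B.Succs.end())
    return -1;

  int FallThrough = (B.Succs[0] == Target) ? B.Succs[1] : B.Succs[0];
  if (FallThrough == Target)
    return -1;

  // The fall-through block must be empty and lead straight to Target.
  const Block &F = CFG[FallThrough];
  if (!F.Empty)
    return -1;
  if (F.Succs.size() != 1 || F.Succs[0] != Target)
    return -1;
  return FallThrough;
}
```

For the machine code in the file header, calling this with BB#0 as `Branch` and BB#2 as `Target` would return BB#1, provided BB#1 is modeled as empty.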

				///
				/// Determine if the two operand lists are identical
				///
				/// \param[in] OpList1 operand list
				/// \param[in] OpList2 operand list
				/// \return true if and only if the operands lists are identical
				///
				bool BranchCoalescing::identicalOperands(
				ArrayRef<MachineOperand> OpList1, ArrayRef<MachineOperand> OpList2) const {

				if (OpList1.size() != OpList2.size()) {
				DEBUG(dbgs() << "Operand list is different size\n");
				return false;
				}

				for (unsigned i = 0; i < OpList1.size(); ++i) {
				const MachineOperand &Op1 = OpList1[i];
				const MachineOperand &Op2 = OpList2[i];

				DEBUG(dbgs() << "Op1: " << Op1 << "\n"
				<< "Op2: " << Op2 << "\n");

				if (Op1.isIdenticalTo(Op2)) {
				DEBUG(dbgs() << "Op1 and Op2 are identical!\n");
				continue;
				}

				// If the operands are not identical, but are registers, check to see if the
nemanjai (Done): Use same debug format for both (i.e. space after the colon).
lei (Done): fixed
				// definition of the register produces the same value. If they produce the
				// same value, consider them to be identical.
				if (Op1.isReg() && Op2.isReg() &&
				TargetRegisterInfo::isVirtualRegister(Op1.getReg()) &&
				TargetRegisterInfo::isVirtualRegister(Op2.getReg())) {
				MachineInstr *Op1Def = MRI->getVRegDef(Op1.getReg());
				MachineInstr *Op2Def = MRI->getVRegDef(Op2.getReg());
				if (TII->produceSameValue(Op1Def, Op2Def, MRI)) {
nemanjai (Done): Won't this fail (as in assert) if `Op2.isReg() != true`?
lei (Done): No, since we check that Op2 and Op1 are the same type on line 330. We do an early exit if they are not the same type. getType() returns an enum MachineOperandType; isReg() basically checks that the MachineOperand's type (MachineOperandType) is the specific enum value MO_Register.
nemanjai (Done): Oh OK. I missed the early exit check. I think the layout of this function is a little confusing so I can see why I missed it. It seems that the conditions under which we return true are (any of):
- The lists are pair-wise identical
- For any pairs that aren't identical, they're both registers that produce the same value
Why not such a simple layout to the function? For example, your check for same type is repeated in the call to `MachineOperand::isIdenticalTo()`. It seems to me like a layout such as this for the loop encapsulates what this function should do:

  for (unsigned i = 0; i < OpList1.size(); ++i) {
    const MachineOperand &Op1 = OpList1[i];
    const MachineOperand &Op2 = OpList2[i];

    DEBUG(dbgs() << "Op1: " << Op1 << "\n"
                 << "Op2: " << Op2 << "\n");

    if (Op1.isIdenticalTo(Op2)) {
      DEBUG(dbgs() << "Op1 and Op2 are identical!\n");
      return true;
    }
    if (Op1.isReg() && Op2.isReg() &&
        TargetRegisterInfo::isVirtualRegister(Op1.getReg()) &&
        TargetRegisterInfo::isVirtualRegister(Op2.getReg())) {
      if (TII->produceSameValue(Op1Def, Op2Def, MRI)) {
        DEBUG(dbgs() << "Operands produce the same value.\n");
        return true;
      }
    }
    DEBUG(dbgs() << "The operands are not provably identical.\n");
    return false;
  }

lei (Done): Minor change so that we do an early exit when operands within the operand lists are detected to be different.

  for (unsigned i = 0; i < OpList1.size(); ++i) {
    const MachineOperand &Op1 = OpList1[i];
    const MachineOperand &Op2 = OpList2[i];

    DEBUG(dbgs() << "Op1: " << Op1 << "\n"
                 << "Op2: " << Op2 << "\n");

    if (Op1.isIdenticalTo(Op2)) {
      DEBUG(dbgs() << "Op1 and Op2 are identical!\n");
      continue;
    }

    // If the operands are not identical, but are registers, check to see if the
    // definition of the register produces the same value. If they produce the
    // same value, consider them to be identical.
    if (Op1.isReg() && Op2.isReg() &&
        TargetRegisterInfo::isVirtualRegister(Op1.getReg()) &&
        TargetRegisterInfo::isVirtualRegister(Op2.getReg())) {
      MachineInstr *Op1Def = MRI->getVRegDef(Op1.getReg());
      MachineInstr *Op2Def = MRI->getVRegDef(Op2.getReg());
      if (TII->produceSameValue(Op1Def, Op2Def, MRI)) {
        DEBUG(dbgs() << "Op1Def: " << Op1Def << " and " << Op2Def
                     << " produce the same value!\n");
      } else {
        DEBUG(dbgs() << "Operands produce different values\n");
        return false;
      }
    } else {
      DEBUG(dbgs() << "The operands are not provably identical.\n");
      return false;
    }
  }

nemanjai (Done): Sorry, the two `return true;` lines should be `continue;` so that you keep checking the remainder of the lists rather than returning true if the first pair are identical.
lei (Not Done): see code snippet above.
				DEBUG(dbgs() << "Op1Def: " << Op1Def << " and " << Op2Def
				<< " produce the same value!\n");
				} else {
				DEBUG(dbgs() << "Operands produce different values\n");
				return false;
				}
				} else {
				DEBUG(dbgs() << "The operands are not provably identical.\n");
echristo (Done): "beginning"
				return false;
				}
				}
				return true;
nemanjai (Done): Maybe a comment as to why we fail here if we get physical registers. Also, if I follow the control flow here correctly, we will return `true` here if one is a physical register and the other is a virtual register. Is that what we want? I imagine not.
lei (Done): It was just put in to see if we trip on it. I have removed this.
nemanjai (Done): I'm afraid this won't suffice. If it were to somehow happen that (at least) one is a physical register, you'll return true here - regardless of whether it's the same register (not that it really matters whether it's the same register or not).
lei (Done): changes in previous comment will address this issue as well.
				}
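
The `return true` versus `continue` point raised in the inline thread on this function is easy to check in isolation. In the sketch below, plain `int` values stand in for operands and both function names are hypothetical; only the control flow mirrors the two loop layouts that were discussed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Layout with `return true` inside the loop: the answer is decided by the
// first pair alone, so later mismatches go unnoticed.
bool identicalFirstPairOnly(const std::vector<int> &A,
                            const std::vector<int> &B) {
  if (A.size() != B.size())
    return false;
  for (std::size_t I = 0; I < A.size(); ++I) {
    if (A[I] == B[I])
      return true;
    return false;
  }
  return true;
}

// Layout with `continue`: a matching pair only moves on to the next pair,
// and the function returns true once every pair has been checked.
bool identicalAllPairs(const std::vector<int> &A, const std::vector<int> &B) {
  if (A.size() != B.size())
    return false;
  for (std::size_t I = 0; I < A.size(); ++I) {
    if (A[I] == B[I])
      continue;
    return false;
  }
  return true;
}

// Usage: the lists differ only in the second element.
void demoPairwise() {
  std::vector<int> X = {1, 2, 3};
  std::vector<int> Y = {1, 9, 3};
  assert(identicalFirstPairOnly(X, Y)); // wrongly reports a match
  assert(!identicalAllPairs(X, Y));     // mismatch at index 1 is caught
}
```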

				///
				/// Moves ALL PHI instructions in SourceMBB into TargetMBB and update them to
				/// refer to the new block. PHI instructions in SourceMBB are placed at the
				/// beginning of TargetMBB, before existing PHI instructions.
				///
echristo (Done): Could use an explanation of why - it confused Nemanja at first and could confuse others.
lei (Done): Since PHI node ordering cannot be assumed, it doesn't really matter where we place the PHI instructions. Will update comment to reflect this.
				/// \param[in] SourceMBB block to move PHI instructions from
				/// \param[in] TargetMBB block to move PHI instructions to
				///
				void BranchCoalescing::moveAndUpdatePHIs(MachineBasicBlock *SourceMBB,
echristo (Done): "contains"
				MachineBasicBlock *TargetMBB) {

				MachineBasicBlock::iterator MI = SourceMBB->begin();
				MachineBasicBlock::iterator ME = SourceMBB->getFirstNonPHI();

				if (MI == ME) {
				DEBUG(dbgs() << "SourceMBB contain no PHI instructions.\n");
				return;
				}

				// Always to move to top of TargetMBB
nemanjai (Done): Not that it makes a difference computationally, but perhaps for consistency:

  for (MachineBasicBlock::instr_iterator Instr : MBB.instrs())
    if (Instr->isPHI())
      return Instr;
  return MBB.instr_end();

Also, I'm not sure what the style guide says regarding the use of `auto` there, but perhaps instead of `MachineBasicBlock::instr_iterator`, it should be `auto`. Finally, if every use of this function will just use `begin()` instead of `end()` when there are no existing PHI's, perhaps it would make sense to actually return `begin()` if there are no PHI's in the MBB.
lei (Done): Can begin() also be a PHI? If not, then that is okay; if it can, then it is better to return end() so that we know there are no PHIs in this block.
nemanjai (Done): Are you referring to some potential future use of this function? Because this result isn't currently used to determine whether the block has any PHI's. But I suppose you're right it is potentially dangerous for it to return a valid iterator when there are no PHI's. Also, the work this function does just seems like overkill. It seems like you should only be checking instructions between `MBB.instr_begin()` and `MBB.getFirstNonPHI()`. Why does that not suffice? In any case, considering this function is only used once and in many cases it'll return a value that you're just going to discard and use the `begin()` pointer, I don't think this function should exist. Just find the insert location in the function where you need it.
lei (Not Done): will remove function and add loop to location where it's being used.
lei (Not Done): Actually according to the LLVM language reference, PHI instructions must be first in a basic block. Will remove this function and replace the function call below with MBB->begin()
echristo (Done): I don't think this has ever been changed, but I'm not against a separate function. That said, doesn't transferSuccessorsAndUpdatePHIs already do this? Also, this loop needs documentation :)
lei (Not Done): It's because here we only want to move the PHIs from SourceMBB to TargetMBB. SourceMBB will be deleted later. We don't want to transfer the successor info here. Control flow needs to be updated once we have deleted SourceMBB.
				MachineBasicBlock::iterator InsertLoc = TargetMBB->begin();
				for (MachineBasicBlock::iterator Iter = MI; Iter != ME; Iter++) {
				MachineInstr &PHIInst = *Iter;
				for (unsigned i = 2, e = PHIInst.getNumOperands() + 1; i != e; i += 2) {
				MachineOperand &MO = PHIInst.getOperand(i);
				if (MO.getMBB() == SourceMBB)
				MO.setMBB(TargetMBB);
				}
				nemanjai (Done): It seems below like you're inserting them before any existing PHI instructions.
				lei (Author, Done): Yes. They should be inserted before existing PHIs. Will update the doc.
				}
				// Move all PHI instructions in SourceMBB to TargetMBB
				TargetMBB->splice(InsertLoc, SourceMBB, MI, ME);
				}
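
The PHI-moving step above can be illustrated outside LLVM with a toy model. This is only a minimal sketch under stated assumptions, not the patch's code: `ToyInstr`, `ToyBlock`, and `movePhisToFront` are hypothetical names, a block is modeled as a `std::list`, and each PHI holds (value, predecessor block id) pairs instead of machine operands. It retargets incoming edges that name the source block and then moves the leading PHI range with the range form of `std::list::splice`, which plays the role of the `splice` call in the function above.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types; these are not LLVM's MachineInstr/MachineBasicBlock.
struct ToyInstr {
  std::string Name;
  bool IsPHI;
  // For PHIs: (incoming value, id of the predecessor block) pairs.
  std::vector<std::pair<std::string, int>> Incoming;
};

struct ToyBlock {
  int Id;
  std::list<ToyInstr> Instrs;
};

// Move the leading PHIs of Source to the front of Target. Any incoming edge
// that names Source itself is first rewritten to name Target.
inline void movePhisToFront(ToyBlock &Source, ToyBlock &Target) {
  auto FirstNonPHI = Source.Instrs.begin();
  while (FirstNonPHI != Source.Instrs.end() && FirstNonPHI->IsPHI) {
    for (auto &In : FirstNonPHI->Incoming)
      if (In.second == Source.Id)
        In.second = Target.Id;
    ++FirstNonPHI;
  }
  // Range splice: list nodes are relinked, not copied. An empty range is a
  // no-op, so no separate "no PHIs" check is needed in this toy version.
  Target.Instrs.splice(Target.Instrs.begin(), Source.Instrs,
                       Source.Instrs.begin(), FirstNonPHI);
}
```

As in the reviewed function, the moved PHIs end up before any PHIs that already exist in the target.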

				///
				/// Determine if the specified instruction can be moved to the TargetMBB.
				nemanjai (Not Done): Can you please state why something like `MachineBasicBlock::SkipPHIsLabelsAndDebug()` cannot be used instead of this function (thereby eliminating the need for that function altogether)?
				lei (Author, Not Done): MachineBasicBlock::SkipPHIsLabelsAndDebug() returns the iterator to the first NON PHI, non-label instruction. We want the iterator to the first PHI node in a MBB. I don't see any function within that class that returns the iterator to the first PHI.
				nemanjai (Not Done): OK, that makes sense. I suppose you want to make sure that you skip anything that comes between `begin()` and the first PHI node in the `ToMBB`. But I guess what I'm confused about is exactly what this function is meant to do. It appears that it will do the following:
				  1. Iterate over all the PHI nodes in MBB called `FromMBB`.
				  2. If there are any PHI nodes that refer to the same MBB, they'll be updated so they refer to the MBB called `ToMBB`.
				  3. Then each of those PHI nodes will be moved to MBB called `ToMBB` before any existing PHI nodes in `ToMBB`.
				However, what happens to any PHI nodes in `ToMBB` that refer to `FromMBB`? They don't need to be updated?
				lei (Author, Not Done): This function moves PHI nodes in the FromMBB to the beginning of the PHI block in ToMBB. If any PHI nodes in ToMBB reference registers defined in PHI nodes in FromMBB, the PHI nodes in FromMBB cannot be moved to ToMBB since we cannot infer order in the PHI node execution.
				echristo (Done): Should probably assert this somewhere.
				lei (Author, Not Done): We don't assert because we first check one direction (canMoveToBeginning) and then check the other direction (canMoveToEnd). If both fail, we just don't move the instruction and move on to investigate the next one.
				echristo (Done): This seems like an odd comment.
				lei (Author, Not Done): A MI instruction can only be moved to TargetMBB if there are no uses of it within the TargetMBB's PHI nodes. Will reword the comment.
				/// If MoveToBeginning is set to true, function checks if MI can be moved to
				/// the beginning of the TargetMBB following PHI instructions.
				/// If MoveToBeginning is set to false, checks if MI can be moved to the end
				/// of the TargetMBB, immediately before the first terminator.
				///
				/// An MI instruction can be moved to beginning of the TargetMBB if there are no
				/// PHI's in the TargetMBB that use what MI defines.
				///
				/// An MI instruction can be moved to the end of the TargetMBB if no PHI node
				/// defines what MI uses within its own MBB.
				///
				echristo (Done): I think it makes just as much sense (and is more readable) if you split the function into canMoveToBeginning and canMoveToEnd and just drop the boolean parameter. It'll also make it easier to split the docs.
				lei (Author, Not Done): Agreed.
				/// \param[in] MI the machine instruction to move.
				/// \param[in] MBB the machine basic block to move to
				/// \param[in] MoveToBeginning true indicates move to the beginning of MBB,
				/// false indicates move to end of MBB.
				/// \return true if it is safe to move MI to MBB, false otherwise
				nemanjai (Not Done): You should have a range of iterators that represent the PHI's in the `From` block. Why not use the range-version of `splice()` after this loop? If it is to avoid the empty range check on the PHI's in the `From` block, maybe just a comment to that end.
				lei (Author, Done): Will update to use range version of splice().
				///
				bool BranchCoalescing::canMoveTo(const MachineInstr &MI,
				const MachineBasicBlock &TargetMBB,
				bool MoveToBeginning) const {

				if (MoveToBeginning) {
				nemanjai (Done): The second sentence is redundant.
				lei (Author, Done): Agreed.
				DEBUG(dbgs() << "Checking if " << MI << " can move to beginning of "
				<< TargetMBB.getNumber() << "\n");
				for (auto &Def : MI.defs()) { // Looking at Def
				for (auto &Use : MRI->use_instructions(Def.getReg())) {
				nemanjai (Done): '... used in this block ...' means '... used in original block ...'? Overall, I think this comment is kind of hard to follow. I think it would suffice to say when an instruction can move to the begin/end location. Something along the lines of: An instruction MI can move from MBB Source to MBB Target under the following conditions:
				  1. There are no PHI's in Target that use what MI defines.
				  2. No instructions in Source use what MI defines (unless Target is a dominator of Source).
				  3. No instructions in Source define what MI uses (unless Source is a dominator of Target).
				  4. If any instructions in Target define what MI uses, the MI can only move to the end of Target.
				lei (Author, Done): Arguments renamed and doc updated to be more clear.
				if (Use.isPHI() && Use.getParent() == &TargetMBB) {
				DEBUG(dbgs() << " * used in a PHI -- cannot move *\n");
				return false;
				}
				}
				}
				} else {
				DEBUG(dbgs() << "Checking if " << MI << " can move to end of "
				<< TargetMBB.getNumber() << "\n");
				for (auto &Use : MI.uses()) {
				if (Use.isReg() && TargetRegisterInfo::isVirtualRegister(Use.getReg())) {
				MachineInstr *DefInst = MRI->getVRegDef(Use.getReg());
				if (DefInst->isPHI() && DefInst->getParent() == MI.getParent()) {
				DEBUG(dbgs() << " * Cannot move this instruction *\n");
				return false;
				} else {
				DEBUG(dbgs() << " *** def is in another block -- safe to move!\n");
				}
				}
				}
				}

				DEBUG(dbgs() << " Safe to move\n");
				nemanjai (Not Done): I don't understand the purpose of the comment.
				lei (Author, Not Done): Will remove this redundant check since MI.defs() returns register definitions.
				return true;
				}

				///
				/// This method checks to ensure the two coalescing candidates follow the
				/// expected pattern required for coalescing.
				///
				/// \param[in] SourceRegion The candidate to move statements from
				/// \param[in] TargetRegion The candidate to move statements to
				/// \return true if all instructions in SourceRegion.BranchBlock can be merged
				/// into a block in TargetRegion; false otherwise.
				///
				bool BranchCoalescing::validateCandidates(
				CoalescingCandidateInfo &SourceRegion,
				CoalescingCandidateInfo &TargetRegion) const {
				std::string err_msg;

				if (TargetRegion.BranchTargetBlock != SourceRegion.BranchBlock)
				err_msg = "Expecting SourceRegion to immediately follow TargetRegion";
				else if (!MDT->dominates(TargetRegion.BranchBlock, SourceRegion.BranchBlock))
				err_msg = "Expecting TargetRegion to dominate SourceRegion";
				else if (!MPDT->dominates(SourceRegion.BranchBlock, TargetRegion.BranchBlock))
				err_msg = "Expecting SourceRegion to post-dominate TargetRegion";
				else if (!TargetRegion.FallThroughBlock->empty() ||
				nemanjai (Not Done): I don't think this function suffices to determine whether an instruction can move to a block. Example MBB's:
				  BB#0:
				    ...
				    ; some non-PHI def of %vreg2
				    ; some non-PHI def of %vreg3
				    %vreg10 = ADD %vreg2, %vreg3 ; MI
				    ...
				  BB#1: ; MBB1
				    %vreg5 = PHI %vreg10, <BB#0>, %vreg4, <BB#2>
				    ...
				The call `canMoveTo(MI, MBB1, false)` would return true, but it is not safe to do so. I think this patch uses this query in very limited circumstances, so it isn't necessarily required for this function to determine whether it is safe to move an arbitrary instruction to an arbitrary block. But if this is the case, we should add asserts to eliminate incorrect uses of the function (in the future).
				lei (Author, Not Done): You are right, this function alone cannot be used to determine whether an instruction can move to a random block. This function is a helper function used by canMerge() to determine whether 2 candidates can be merged.
				nemanjai (Not Done): Yes, but please add asserts to ensure this function isn't used incorrectly in the future. It is dangerous for a function called `canMoveTo` to return true when an instruction can't safely move to a basic block.
				lei (Author, Not Done): We always return false when an instruction isn't safe to remove. This function just does a check to see if it's valid to move an instruction to the BB at the specified location. It won't work if it asserts when it can't be moved.
				echristo (Done): Possible you just want to replace this with a bunch of llvm_unreachables?
				lei (Author, Not Done): Yup.
				!SourceRegion.FallThroughBlock->empty())
				err_msg = "Expecting fall-through blocks to be empty";

				bool verify = err_msg.empty();
				DEBUG(dbgs() << err_msg << "\n");

				assert(verify && "Invalid candidates for branch coalescing!");

				return (verify);
				}
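
The first-failure chain in `validateCandidates` can be sketched without LLVM types. This is a hedged illustration, not the patch's implementation: `ToyCandidateFacts`, `firstViolation`, and `isValidPair` are hypothetical names, and the four booleans stand in for the block-identity, dominance, post-dominance, and emptiness queries made above. The sketch deliberately reports a violation through its return value only, so its result is the same whether or not asserts are compiled in, which is the concern raised in the review about asserts versus early exits.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of the facts the validation inspects.
struct ToyCandidateFacts {
  bool SourceFollowsTarget;        // Target's branch target is Source's block.
  bool TargetDominatesSource;
  bool SourcePostDominatesTarget;
  bool FallThroughBlocksEmpty;
};

// Returns the first violated precondition, or an empty string if none.
inline std::string firstViolation(const ToyCandidateFacts &F) {
  if (!F.SourceFollowsTarget)
    return "Expecting SourceRegion to immediately follow TargetRegion";
  if (!F.TargetDominatesSource)
    return "Expecting TargetRegion to dominate SourceRegion";
  if (!F.SourcePostDominatesTarget)
    return "Expecting SourceRegion to post-dominate TargetRegion";
  if (!F.FallThroughBlocksEmpty)
    return "Expecting fall-through blocks to be empty";
  return std::string();
}

// Early-exit form: callers get false on any violation, with or without NDEBUG.
inline bool isValidPair(const ToyCandidateFacts &F) {
  return firstViolation(F).empty();
}
```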

				///
				/// This method determines whether the two coalescing candidates can be merged.
				echristo (Not Done): Why?
				lei (Author, Not Done): Maybe preference is the wrong word here... this is by design. Will update the doc.
				/// In order to be merged, all instructions must be able to
				/// 1. Move to the beginning of the SourceRegion.BranchTargetBlock;
				/// 2. Move to the end of the TargetRegion.BranchBlock.
				/// Merging involves moving the instructions in the
				/// TargetRegion.BranchTargetBlock (also SourceRegion.BranchBlock).
				///
				/// The preference is to move instructions down, to the
				/// beginning of the SourceRegion.BranchTargetBlock. This is not possible if any
				echristo (Done): Sounds like an assert.
				lei (Author, Not Done): We do assert on that, will update the doc here.
				/// register defined in SourceRegion.BranchBlock is used in a PHI node in the
				/// SourceRegion.BranchTargetBlock. In this case, check whether the statement
				/// can be moved up, to the end of the TargetRegion.BranchBlock (immediately
				/// before the branch statement). If it cannot move, then these blocks cannot
				/// be merged.
				///
				/// Note that there is no analysis for moving instructions past the fall-through
				/// blocks because they are assumed to be empty. If they are not empty, then
				/// additional safety analysis must be added here to ensure it is safe to move
				/// the instructions in SourceRegion.BranchBlock past the fall-through blocks.
				///
				/// \param[in] SourceRegion The candidate to move statements from
				/// \param[in] TargetRegion The candidate to move statements to
				/// \return true if all instructions in SourceRegion.BranchBlock can be merged
				/// into a block in TargetRegion; false otherwise.
				///
				bool BranchCoalescing::canMerge(CoalescingCandidateInfo &SourceRegion,
				CoalescingCandidateInfo &TargetRegion) const {
				if (!validateCandidates(SourceRegion, TargetRegion))
				return false;
				nemanjai (Not Done): Why are all of these asserts rather than early exits returning false? On builds that compile asserts out, I imagine we could end up doing the wrong thing.
				lei (Author, Not Done): This seems to be the case for other classes like MachineOperand. If assert is off it could end up doing the wrong thing....

				// Walk through PHI nodes first and see if they force the merge into the
				// SourceRegion.BranchTargetBlock.
				nemanjai (Done): I don't think there's anything to be gained from printing the name of the function emitting the message in the debug output. Please remove this.
				lei (Author, Done): Okay.
				for (MachineBasicBlock::iterator
				I = SourceRegion.BranchBlock->instr_begin(),
				E = SourceRegion.BranchBlock->getFirstNonPHI();
				I != E; ++I) {
				for (auto &Def : I->defs())
				for (auto &Use : MRI->use_instructions(Def.getReg())) {
				echristo (Done): Seems like you want to swap these conditionals?
				if (Use.getParent() == SourceRegion.BranchBlock) {
				DEBUG(dbgs() << "PHI " << *I
				<< " defines register used in this "
				"block -- all must move down\n");
				SourceRegion.MustMoveDown = true;
				}
				if (Use.isPHI() && Use.getParent() == SourceRegion.BranchTargetBlock) {
				DEBUG(dbgs() << "PHI " << *I << " defines register used in another "
				"PHI within branch target block -- can't merge\n");
				NumPHINotMoved++;
				return false;
				}
				}
				}

				for (MachineBasicBlock::iterator
				I = SourceRegion.BranchBlock->getFirstNonPHI(),
				E = SourceRegion.BranchBlock->end();
				echristo (Done): else if? Otherwise what happens?
				lei (Author, Not Done): No else if. We check conditions to move up and down separately and verify that only one direction is valid.
				I != E; ++I) {
				if (!canMoveTo(I, SourceRegion.BranchTargetBlock, true)) {
				DEBUG(dbgs() << "Instruction " << *I
				<< " cannot move down - must move up!\n");
				SourceRegion.MustMoveUp = true;
				}
				if (!canMoveTo(I, TargetRegion.BranchBlock, false)) {
				DEBUG(dbgs() << "Instruction " << *I
				<< " cannot move up - must move down!\n");
				SourceRegion.MustMoveDown = true;
				}
				}

				return (SourceRegion.MustMoveUp && SourceRegion.MustMoveDown) ? false : true;
				}
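
The direction logic in `canMerge` reduces to two sticky flags. The following is a small sketch of that reduction under stated assumptions: `MoveOptions`, `MergeDirection`, and `decideDirection` are hypothetical names, and the two booleans per instruction stand in for the results of `canMoveTo(..., true)` and `canMoveTo(..., false)`. It does not model the PHI pre-pass, which can also set `MustMoveDown` or reject the merge outright.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-instruction summary of the two canMoveTo queries.
struct MoveOptions {
  bool CanMoveDown; // Can move to the beginning of the branch target block.
  bool CanMoveUp;   // Can move to the end of the preceding branch block.
};

enum class MergeDirection { Either, Up, Down, None };

// An instruction that cannot go one way forces the other way for the whole
// block; if both directions are forced, the blocks cannot be merged.
inline MergeDirection decideDirection(const std::vector<MoveOptions> &Instrs) {
  bool MustMoveUp = false, MustMoveDown = false;
  for (const MoveOptions &MO : Instrs) {
    if (!MO.CanMoveDown)
      MustMoveUp = true;
    if (!MO.CanMoveUp)
      MustMoveDown = true;
  }
  if (MustMoveUp && MustMoveDown)
    return MergeDirection::None;
  if (MustMoveUp)
    return MergeDirection::Up;
  if (MustMoveDown)
    return MergeDirection::Down;
  return MergeDirection::Either;
}
```

In the patch, `mergeCandidates` moves instructions down only when `MustMoveDown` is set, so the `Either` case of this sketch corresponds to moving up.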

				/// Merge the instructions from SourceRegion.BranchBlock,
				/// SourceRegion.BranchTargetBlock, and SourceRegion.FallThroughBlock into
				/// TargetRegion.BranchBlock, TargetRegion.BranchTargetBlock and
				/// TargetRegion.FallThroughBlock respectively.
				nemanjai (Done): The use of 'From', 'from', 'To' and 'to' makes the comment confusing. Please consider renaming the parameters or otherwise clarifying this comment. Perhaps something like: `Merge the CFG triangle \p From into the preceding CFG triangle \p To.`
				lei (Author, Done): Renamed "From/To" to "SourceRegion/TargetRegion". I believe this is now less confusing ...
				///
				/// The successors for blocks in TargetRegion will be updated to use the
				/// successors from blocks in SourceRegion. Finally, the blocks in SourceRegion
				/// will be removed from the function.
				///
				/// A region consists of a BranchBlock, a FallThroughBlock, and a
				/// BranchTargetBlock. Branch coalesce works on patterns where the
				/// TargetRegion's BranchTargetBlock must also be the SourceRegion's
				/// BranchBlock.
				///
				///  Before mergeCandidates:
				///
				///  +---------------------------+
				///  |  TargetRegion.BranchBlock |
				///  +---------------------------+
				///     /        |
				///    /   +--------------------------------+
				///   |    |  TargetRegion.FallThroughBlock |
				///    \   +--------------------------------+
				///     \        |
				///  +----------------------------------+
				///  |  TargetRegion.BranchTargetBlock  |
				///  |  SourceRegion.BranchBlock        |
				///  +----------------------------------+
				///     /        |
				///    /   +--------------------------------+
				///   |    |  SourceRegion.FallThroughBlock |
				///    \   +--------------------------------+
				///     \        |
				///  +----------------------------------+
				///  |  SourceRegion.BranchTargetBlock  |
				///  +----------------------------------+
				///
				///  After mergeCandidates:
				///
				///  +-----------------------------+
				///  |  TargetRegion.BranchBlock   |
				///  |  SourceRegion.BranchBlock   |
				///  +-----------------------------+
				///     /        |
				///    /   +---------------------------------+
				///   |    |  TargetRegion.FallThroughBlock  |
				///   |    |  SourceRegion.FallThroughBlock  |
				///    \   +---------------------------------+
				///     \        |
				///  +----------------------------------+
				///  |  SourceRegion.BranchTargetBlock  |
				///  +----------------------------------+
				///
				/// \param[in] SourceRegion The candidate to move blocks from
				/// \param[in] TargetRegion The candidate to move blocks to
				///
				bool BranchCoalescing::mergeCandidates(CoalescingCandidateInfo &SourceRegion,
				CoalescingCandidateInfo &TargetRegion) {
				nemanjai (Done): Are these two asserts enough or does `From.BranchBlock` need to [immediately] dominate both `To.BranchTargetBlock` and `To.FallThroughBlock`?
				lei (Author, Not Done): From.BranchBlock MUST equal To.BranchTargetBlock, not dominate.
				nemanjai (Not Done): The two are not mutually exclusive. Also, this doesn't answer whether `From.BranchBlock` needs to dominate `To.FallThroughBlock`.
				lei (Author, Done): The same kind of checks are needed for functions canMerge and mergeCandidates. Put the checks into function validateCandidates() and call it to verify MBBs and do an early exit if it returns false.

				if (SourceRegion.MustMoveUp && SourceRegion.MustMoveDown) {
				assert(0 && "Cannot have both MustMoveDown and MustMoveUp set!");
				DEBUG(dbgs() << "Cannot have both MustMoveDown and MustMoveUp set!");
				echristo (Done): llvm_unreachable
				lei (Author, Done): Okay.
				return false;
				}

				if (!validateCandidates(SourceRegion, TargetRegion))
				return false;

				// Handle the BranchBlock first
				// Move any PHIs in SourceRegion.BranchBlock down to the branch-taken block
				nemanjai (Done): You've already moved all the PHI's from `From.BranchBlock`. Can't this just be a loop over `From.BranchBlock->instrs()`? Or it may need to be a loop from `rbegin()` to `rend()`.
				lei (Author, Done): Agreed.
				lei (Author, Done): Changing this to be a range based splice call instead.
				moveAndUpdatePHIs(SourceRegion.BranchBlock, SourceRegion.BranchTargetBlock);

				// Move remaining instructions in SourceRegion.BranchBlock into
				// TargetRegion.BranchBlock
				MachineBasicBlock::iterator firstInstr =
				SourceRegion.BranchBlock->getFirstNonPHI();
				nemanjai (Done): This is loop-invariant. I think it'd be cleaner to move the assert out of the loop and call `splice()` on the result of a ternary operator. Maybe even invert the `if` and the loop.
				lei (Author, Done): Assert moved to top of function.
				MachineBasicBlock::iterator lastInstr =
				SourceRegion.BranchBlock->getFirstTerminator();

				MachineBasicBlock *Source = SourceRegion.MustMoveDown
				? SourceRegion.BranchTargetBlock
				: TargetRegion.BranchBlock;

				MachineBasicBlock::iterator Target =
				SourceRegion.MustMoveDown
				? SourceRegion.BranchTargetBlock->getFirstNonPHI()
				: TargetRegion.BranchBlock->getFirstTerminator();

				Source->splice(Target, SourceRegion.BranchBlock, firstInstr, lastInstr);

				// Clean-up the control flow
				// Remove SourceRegion.FallThroughBlock before transferring successors of
				echristo (Done): Nit: All comments should be complete sentences. (A few other occurrences)
				lei (Author, Done): k.
				// SourceRegion.BranchBlock to TargetRegion.BranchBlock.
				SourceRegion.BranchBlock->removeSuccessor(SourceRegion.FallThroughBlock);
				TargetRegion.BranchBlock->transferSuccessorsAndUpdatePHIs(
				SourceRegion.BranchBlock);
				// Update branch in TargetRegion.BranchBlock to jump to
				// SourceRegion.BranchTargetBlock
				// In this case, TargetRegion.BranchTargetBlock == SourceRegion.BranchBlock.
				TargetRegion.BranchBlock->ReplaceUsesOfBlockWith(
				SourceRegion.BranchBlock, SourceRegion.BranchTargetBlock);
				// Remove the branch statement(s) in SourceRegion.BranchBlock
				MachineBasicBlock::iterator I =
				SourceRegion.BranchBlock->terminators().begin();
				while (I != SourceRegion.BranchBlock->terminators().end()) {
				nemanjai (Done): Where is the assert to ensure this?
				lei (Author, Not Done): In analyzeBranch(), line 297, we do an early exit if the FallThroughBlock contains code.
				nemanjai (Done): Sure. Now there is. I'd still prefer an assert. If someone in the future feels that `analyzeBranch()` doesn't need that early exit any longer, they'll need to ensure they account for this assert and why it's in this function.
				lei (Author, Done): Okay.
				MachineInstr &CurrInst = *I;
				++I;
				if (CurrInst.isBranch())
				CurrInst.eraseFromParent();
				}

				// Merge FallThroughBlock
				// Move any PHIs down to the branch-taken block

				// Not necessary to merge the fall-through blocks, they should be empty!
				assert(TargetRegion.FallThroughBlock->empty() &&
				echristo (Done): This set of comments is very confusing.
				lei (Author, Done): Okay.
				"FallThroughBlocks should be empty!");

				// We still need to transfer the successors though, and update the CFG
				TargetRegion.FallThroughBlock->transferSuccessorsAndUpdatePHIs(
				SourceRegion.FallThroughBlock);
				TargetRegion.FallThroughBlock->removeSuccessor(SourceRegion.BranchBlock);

				// Remove the blocks from the function.
				assert(SourceRegion.BranchBlock->empty() &&
				"Expecting branch block to be empty!");
				SourceRegion.BranchBlock->eraseFromParent();

				assert(SourceRegion.FallThroughBlock->empty() &&
				"Expecting fall-through block to be empty!\n");
				SourceRegion.FallThroughBlock->eraseFromParent();

				NumBlocksCoalesced++;
				return true;
				}

				bool BranchCoalescing::runOnMachineFunction(MachineFunction &MF) {

				if (skipFunction(*MF.getFunction()) || MF.empty() ||
				!isBranchCoalescingEnabled())
				return false;

				bool didSomething = false;

				#ifndef NDEBUG
				MF.verify(nullptr, "Error in code going into branch coalescing");
				#endif // NDEBUG

				DEBUG(dbgs() << "****** Branch Coalescing ******\n");
				initialize(MF);

				DEBUG(dbgs() << "Function: "; MF.dump(); dbgs() << "\n");

				CoalescingCandidateInfo Cand1, Cand2;
				// Walk over blocks and find candidates to merge
				// Continue trying to merge with the first candidate found, as long as merging
				// is successful.
				for (MachineBasicBlock &MBB : MF) {
				bool MergedCandidates = false;
				do {
				MergedCandidates = false;
				Cand1.clear();
				Cand2.clear();

				Cand1.BranchBlock = &MBB;

				// If unable to analyze the branch, then continue to next block
				if (!analyzeBranch(Cand1))
				break;

				Cand2.BranchBlock = Cand1.BranchTargetBlock;
				if (!analyzeBranch(Cand2))
				break;

				// Sanity check
				// The branch-taken block of the second candidate should post-dominate the
				// first candidate
				assert(MPDT->dominates(Cand2.BranchTargetBlock, Cand1.BranchBlock) &&
				"Branch-taken block should post-dominate first candidate");
				nemanjai (Not Done): Setting a bool to true and then using it in an if statement without any statements that could modify it in between is redundant. Besides, I think it's overly verbose to dump the entire function after every merge - I'd imagine that a dump after all the merging is done would suffice.
				lei (Author, Done): For debug purposes, it would be nice to see the stages of merging... Will remove the if statement.

				if (!identicalOperands(Cand1.Cond, Cand2.Cond)) {
				DEBUG(dbgs() << "Blocks " << Cand1.BranchBlock->getNumber() << " and "
				<< Cand2.BranchBlock->getNumber()
				<< " have different branches\n");
				break;
				}
				if (!canMerge(Cand2, Cand1)) {
				DEBUG(dbgs() << "Cannot merge blocks " << Cand1.BranchBlock->getNumber()
				<< " and " << Cand2.BranchBlock->getNumber() << "\n");
				NumBlocksNotCoalesced++;
				continue;
				}
				DEBUG(dbgs() << "Merging blocks " << Cand1.BranchBlock->getNumber()
				<< " and " << Cand1.BranchTargetBlock->getNumber() << "\n");
				MergedCandidates = mergeCandidates(Cand2, Cand1);
				if (MergedCandidates)
				didSomething = true;

				DEBUG(dbgs() << "Function after merging: "; MF.dump(); dbgs() << "\n");
				} while (MergedCandidates);
				}

				#ifndef NDEBUG
				kbarton (Done): Did we conclude whether this is different from the checks done by the verifyAfter parameter in addPass?
				lei (Author, Done): Yes. This is different from the checks done by the verifyAfter parameter in addPass. The verifyAfter parameter for this pass is set to true by default.
				MF.verify(nullptr, "Error in code produced by branch coalescing");
				#endif // NDEBUG

				DEBUG(dbgs() << "Finished Branch Coalescing\n");
				echristo (Done): This was to be turned on in a subsequent patch. Remove this and the ppc support please :)
				lei (Author, Not Done): See line 161.
				return didSomething;
				}
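
The control flow of the merging loop above can be hard to follow from the excerpt alone, in particular the `continue` inside a `do { ... } while` loop. The following is a minimal standalone sketch, not the code from this patch: `Cand` and `coalesceChain` are hypothetical names, the "merge" is simulated by erasing an element, and it assumes the merged flag is reset at the top of each iteration (that reset is not visible in the excerpt).

```cpp
#include <vector>

// Hypothetical stand-in for a branch candidate: an id for the branch
// condition plus a flag saying whether merging it is legal.
struct Cand {
  int cond;
  bool mergeable;
};

// Repeatedly merge the first candidate with its successor while both are
// guarded by the same condition. Returns the number of merges performed.
int coalesceChain(std::vector<Cand> chain) {
  int merges = 0;
  bool merged = false;
  do {
    merged = false; // assumed reset; mirrors the role of MergedCandidates
    if (chain.size() < 2)
      break;
    if (chain[0].cond != chain[1].cond)
      break; // different branch conditions: stop looking at this chain
    if (!chain[1].mergeable)
      continue; // jumps to the while test; merged is false, so the loop ends
    chain.erase(chain.begin() + 1); // simulated merge of the two candidates
    merged = true;
    ++merges;
  } while (merged);
  return merges;
}
```

In a `do`/`while` loop, `continue` transfers control to the loop condition rather than to the top of the body, so in this sketch both the `break` and the `continue` paths leave the loop; only a successful merge causes another iteration.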

lib/CodeGen/CMakeLists.txt

 add_llvm_library(LLVMCodeGen
   AggressiveAntiDepBreaker.cpp
   AllocationOrder.cpp
   Analysis.cpp
   AtomicExpandPass.cpp
   BasicTargetTransformInfo.cpp
+  BranchCoalescing.cpp
   BranchFolding.cpp
   BranchRelaxation.cpp
   BuiltinGCs.cpp
   CalcSpillWeights.cpp
   CallingConvLower.cpp
   CodeGen.cpp
   CodeGenPrepare.cpp
   CountingFunctionInserter.cpp

lib/CodeGen/CodeGen.cpp

 #include "llvm-c/Initialization.h"
 #include "llvm/PassRegistry.h"

 using namespace llvm;

 /// initializeCodeGen - Initialize all passes linked into the CodeGen library.
 void llvm::initializeCodeGen(PassRegistry &Registry) {
   initializeAtomicExpandPass(Registry);
+  initializeBranchCoalescingPass(Registry);
   initializeBranchFolderPassPass(Registry);
   initializeBranchRelaxationPass(Registry);
   initializeCodeGenPreparePass(Registry);
   initializeCountingFunctionInserterPass(Registry);
   initializeDeadMachineInstructionElimPass(Registry);
   initializeDetectDeadLanesPass(Registry);
   initializeDwarfEHPreparePass(Registry);
   initializeEarlyIfConverterPass(Registry);

lib/CodeGen/TargetPassConfig.cpp

@@ void TargetPassConfig::addMachineSSAOptimization() {

   // Allow targets to insert passes that improve instruction level parallelism,
   // like if-conversion. Such passes will typically need dominator trees and
   // loop info, just like LICM and CSE below.
   addILPOpts();

   addPass(&MachineLICMID, false);
   addPass(&MachineCSEID, false);

+  // Coalesce basic blocks with the same branch condition
+  addPass(&BranchCoalescingID);
+
   addPass(&MachineSinkingID);

   addPass(&PeepholeOptimizerID);
   // Clean-up the dead code that may have been generated by peephole
   // rewriting.
   addPass(&DeadMachineInstructionElimID);
 }

test/CodeGen/PowerPC/branch_coalesce.ll

This file was added.

				; RUN: llc -mcpu=pwr8 -mtriple=powerpc64le-unknown-linux-gnu -verify-machineinstrs < %s | FileCheck %s
				; RUN: llc -mcpu=pwr8 -mtriple=powerpc64-unknown-linux-gnu -verify-machineinstrs < %s | FileCheck %s

				; Function Attrs: nounwind
				define double @testBranchCoal(double %a, double %b, double %c, i32 %x) {
				entry:
				%test = icmp eq i32 %x, 0
				%tmp1 = select i1 %test, double %a, double 2.000000e-03
				%tmp2 = select i1 %test, double %b, double 0.000000e+00
				%tmp3 = select i1 %test, double %c, double 5.000000e-03

				%res1 = fadd double %tmp1, %tmp2
				%result = fadd double %res1, %tmp3
				ret double %result

				; CHECK-LABEL: @testBranchCoal
				; CHECK: cmplwi [[CMPR:[0-7]+]], 6, 0
				; CHECK: beq [[CMPR]], .LBB[[LAB1:[0-9_]+]]
				; CHECK-DAG: addis [[LD1REG:[0-9]+]], 2, .LCPI0_0@toc@ha
				; CHECK-DAG: addis [[LD2REG:[0-9]+]], 2, .LCPI0_1@toc@ha
				; CHECK-DAG: xxlxor 2, 2, 2
				; CHECK-NOT: beq
				; CHECK-DAG: addi [[LD1BASE:[0-9]+]], [[LD1REG]]
				; CHECK-DAG: addi [[LD2BASE:[0-9]+]], [[LD2REG]]
				; CHECK-DAG: lxsdx 1, 0, [[LD1BASE]]
				; CHECK-DAG: lxsdx 3, 0, [[LD2BASE]]
				; CHECK: .LBB[[LAB1]]
				; CHECK: xsadddp 0, 1, 2
				; CHECK: xsadddp 1, 0, 3
				; CHECK: blr
				}
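
In source-level terms, the three `select` instructions in this test behave like three small regions guarded by the same condition, and the CHECK lines expect a single `beq` instead of three. The following is a rough C++ illustration of that before/after shape, not code from the patch; the function names `beforeCoalescing` and `afterCoalescing` are hypothetical, and the real transformation operates on machine basic blocks rather than on source code.

```cpp
// Hypothetical illustration: three separate branches on the same condition,
// roughly the shape produced when each select is expanded on its own.
double beforeCoalescing(double a, double b, double c, int x) {
  bool test = (x == 0);
  double tmp1 = 2.0e-03, tmp2 = 0.0, tmp3 = 5.0e-03;
  if (test)
    tmp1 = a;
  if (test)
    tmp2 = b;
  if (test)
    tmp3 = c;
  return (tmp1 + tmp2) + tmp3;
}

// The same computation with the three guarded regions merged under one
// branch, which is the effect the test checks for.
double afterCoalescing(double a, double b, double c, int x) {
  double tmp1 = 2.0e-03, tmp2 = 0.0, tmp3 = 5.0e-03;
  if (x == 0) {
    tmp1 = a;
    tmp2 = b;
    tmp3 = c;
  }
  return (tmp1 + tmp2) + tmp3;
}
```

Both functions compute the same value for any input; the second form simply evaluates the condition and branches once.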

test/CodeGen/PowerPC/select-i1-vs-i1.ll

 define ppc_fp128 @testppc_fp128eq(ppc_fp128 %c1, ppc_fp128 %c2, ppc_fp128 %c3, ppc_fp128 %c4, ppc_fp128 %a1, ppc_fp128 %a2) #0 {
 entry:
   %cmp1 = fcmp oeq ppc_fp128 %c3, %c4
   %cmp3tmp = fcmp oeq ppc_fp128 %c1, %c2
   %cmp3 = icmp eq i1 %cmp3tmp, %cmp1
   %cond = select i1 %cmp3, ppc_fp128 %a1, ppc_fp128 %a2
   ret ppc_fp128 %cond

-; FIXME: Because of the way that the late SELECT_* pseudo-instruction expansion
-; works, we end up with two blocks with the same predicate. These could be
-; combined.
+; The default branchCoalescing optimization merged the two same predicate blocks
+; that was expanded by the late SELECT_* pseudo-instruction expansion.

 ; CHECK-LABEL: @testppc_fp128eq
 ; CHECK-DAG: fcmpu {{[0-9]+}}, 6, 8
 ; CHECK-DAG: fcmpu {{[0-9]+}}, 5, 7
 ; CHECK-DAG: fcmpu {{[0-9]+}}, 2, 4
 ; CHECK-DAG: fcmpu {{[0-9]+}}, 1, 3
 ; CHECK: crand [[REG1:[0-9]+]], {{[0-9]+}}, {{[0-9]+}}
 ; CHECK: crand [[REG2:[0-9]+]], {{[0-9]+}}, {{[0-9]+}}
 ; CHECK: crxor [[REG3:[0-9]+]], [[REG2]], [[REG1]]
 ; CHECK: bc 12, [[REG3]], .LBB[[BB1:[0-9_]+]]
 ; CHECK: fmr 11, 9
-; CHECK: .LBB[[BB1]]:
-; CHECK: bc 12, [[REG3]], .LBB[[BB2:[0-9_]+]]
 ; CHECK: fmr 12, 10
-; CHECK: .LBB[[BB2]]:
+; CHECK: .LBB[[BB1]]:
 ; CHECK-DAG: fmr 1, 11
 ; CHECK-DAG: fmr 2, 12
 ; CHECK: blr
 }

 define <2 x double> @testv2doubleslt(float %c1, float %c2, float %c3, float %c4, <2 x double> %a1, <2 x double> %a2) #0 {
 entry:
   %cmp1 = fcmp oeq float %c3, %c4

test/CodeGen/Thumb/select.ll

 define double @f7(double %a, double %b) {
   %tmp = fcmp olt double %a, 1.234e+00
   %tmp1 = select i1 %tmp, double -1.000e+00, double %b
   ret double %tmp1
 }
 ; CHECK-LABEL: f7:
 ; CHECK: blt
-; CHECK: blt
+; CHECK-NOT: blt
 ; CHECK: __ltdf2
 ; CHECK-EABI-LABEL: f7:
 ; CHECK-EABI: __aeabi_dcmplt
 ; CHECK-EABI: bne
-; CHECK-EABI: bne
+; CHECK-EABI-NOT: bne

Diff 85784