This is an archive of the discontinued LLVM Phabricator instance.

Differential D20379

Codegen: Fix broken assumption in Tail Merge.
ClosedPublic

Authored by iteratee on May 18 2016, 11:41 AM.

Details

Reviewers

reames
mcrosier

Summary

Tail merge was making the assumption that a layout successor or
predecessor was always a cfg successor/predecessor. Remove that
assumption. Changes to tests are necessary because the errant cfg edges
were preventing optimizations.
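
As an illustration of the distinction in the summary, here is a minimal sketch on a toy model, not LLVM's actual data structures. `ToyFunction`, `hasLayoutSuccessor`, and `removeEmptyBlock` are hypothetical names; only the idea of checking `isSuccessor` before rewriting edges comes from the patch. Blocks are identified by their position in layout order, so the layout successor of block `b` is simply `b + 1`, whether or not any CFG edge connects them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a machine function (hypothetical, not the LLVM API).
// Block ids are layout positions; Succs[b] lists the CFG successors of b.
struct ToyFunction {
  std::vector<std::vector<std::size_t>> Succs;

  // Layout successor: the next block in emission order, if there is one.
  bool hasLayoutSuccessor(std::size_t B) const { return B + 1 < Succs.size(); }

  // CFG successor test, similar in spirit to MBB->isSuccessor(...).
  bool isSuccessor(std::size_t From, std::size_t To) const {
    const std::vector<std::size_t> &S = Succs[From];
    return std::find(S.begin(), S.end(), To) != S.end();
  }

  // Redirect every edge into the empty block to its layout successor, but
  // only when that layout successor is also a real CFG successor.
  // Returns true if any edge was rewritten.
  bool removeEmptyBlock(std::size_t Empty) {
    assert(Empty < Succs.size() && "block id out of range");
    if (!hasLayoutSuccessor(Empty))
      return false;
    const std::size_t Next = Empty + 1;
    if (!isSuccessor(Empty, Next)) // the guard: layout order alone is not enough
      return false;
    bool Changed = false;
    for (std::vector<std::size_t> &S : Succs)
      for (std::size_t &Target : S)
        if (Target == Empty) {
          Target = Next;
          Changed = true;
        }
    return Changed;
  }
};
```

With blocks laid out as 0, 1, 2, where block 1 is an empty unreachable block with no successors, the guard leaves the edge 0 -> 1 alone instead of inventing an edge 0 -> 2.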

Event Timeline

iteratee updated this revision to Diff 57652. May 18 2016, 11:41 AM

iteratee retitled this revision from to Codegen: Fix broken assumption in Tail Merge..

iteratee updated this object.

iteratee added reviewers: mcrosier, reames.

iteratee set the repository for this revision to rL LLVM.

iteratee added subscribers: llvm-commits, sunfish.

Herald added subscribers: dsanders, jyknight. May 18 2016, 11:41 AM

iteratee added subscribers: qcolombet, t.p.northover. May 18 2016, 11:45 AM

The following 4 tests don't have an obvious fix with this change. I was hoping to get some help from the test authors.

LLVM :: CodeGen/Thumb2/v8_IT_5.ll
LLVM :: CodeGen/WebAssembly/cfg-stackify.ll
LLVM :: CodeGen/X86/licm-dominance.ll
LLVM :: CodeGen/X86/wineh-coreclr.ll

licm-dominance is a real pain because the undefineds get loop-optimized away and invalidate the test assumptions. It looks like it was never correct.

haicheng added a subscriber: haicheng. May 18 2016, 12:06 PM

Feel free to disable CodeGen/WebAssembly/cfg-stackify.ll, such as by just removing the "| FileCheck %s" parts of the RUN lines, for now. I'm happy to fix the test and re-enable it myself later.

(This is a particularly hairy test. I'm hoping that MIR serialization will allow us to rewrite it in a less fragile way. But for now, it seems best to just disable it so that it doesn't block other work.)

Disabled cfg-stackify as requested.

Herald added a subscriber: jfb. May 18 2016, 12:49 PM

In D20379#433442, @sunfish wrote:

Feel free to disable CodeGen/WebAssembly/cfg-stackify.ll, such as by just removing the "| FileCheck %s" parts of the RUN lines, for now. I'm happy to fix the test and re-enable it myself later.

(This is a particularly hairy test. I'm hoping that MIR serialization will allow us to rewrite it in a less fragile way. But for now, it seems best to just disable it so that it doesn't block other work.)

Done. Thanks.

iteratee set the repository for this revision to rL LLVM. May 18 2016, 12:50 PM

For CodeGen/X86/licm-dominance.ll I was specifically hoping that the test could be re-reduced with this change in place. I tried to create a test that would exercise the code in question, but I don't seem to understand licm well enough.

I just noticed that test/CodeGen/WebAssembly/mem-intrinsics.ll seems to be failing as well with this change.
If you can look at that as well, it would be awesome.

test/CodeGen/WebAssembly/mem-intrinsics.ll passes for me, with this patch applied. Let me know if there's anything I can help with.

Hi Kyle,

I think FallThrough does not have to be the CFG successor of MBB. It might be easier to understand it as the potential fallthrough block of either MBB or PrevBB.

Haicheng

lib/CodeGen/BranchFolding.cpp
1381	If you change FallThrough to MF.end(), I think you may miss this optimization. This piece of code tries to optimize the case where the layout order is PrevBB, MBB, FallThrough and PrevBB has CFG edges to both MBB and FallThrough. Here MBB is the return block, PrevBB is both the layout predecessor and CFG predecessor of MBB, and FallThrough is just a layout successor of MBB and must not be the CFG successor of MBB. In this case, MBB can be moved to the bottom of the MF and PrevBB can fall through to FallThrough.
1603	Similar situation here. Fallthrough is used to iterate all blocks below MBB in the layout to find the first non-EHPad. I think changing it to MF.end() may miss this one too.

Narrowed the check for FallThrough being an actual successor to the one place where this was assumed.

Thanks Haicheng.

lib/CodeGen/BranchFolding.cpp
1381	I moved the test to the empty block removal, which does assume that FallThrough is a CFG successor.

That only leaves licm-dominance and Thumb2/v8_IT_5.ll

The optimizer seems to be doing reasonable things in both of those cases.

In D20379#433905, @sunfish wrote:

test/CodeGen/WebAssembly/mem-intrinsics.ll passes for me, with this patch applied. Let me know if there's anything I can help with.

Thanks for looking at that. It was another change, sorry for the noise.

iteratee added a child revision: D20505: Codegen: Make chains from lattice-shaped CFGs. May 20 2016, 6:33 PM

Please see the inlined comment.

lib/CodeGen/BranchFolding.cpp
1258	I think we can enter here only if MBB is empty which means MBB does not have any branch. In this case, I think MBB has nowhere to go but fallthrough to FallThrough and this check is not necessary.

Barring comment from anyone else, I'm going to go ahead and remove

LLVM :: CodeGen/Thumb2/v8_IT_5.ll
LLVM :: CodeGen/X86/licm-dominance.ll

In licm-dominance it's clear that bugpoint has gone too far and the optimizer is being perfectly reasonable,
and it looks pretty reasonable for v8_IT_5.ll as well.

iteratee marked an inline comment as done. May 25 2016, 5:40 PM

iteratee added inline comments.

lib/CodeGen/BranchFolding.cpp
1258	This is actually what caused me to look into fallthrough in this file. An empty unreachable block has no successors, and replacing it with its layout successor can create invalid cfg edges.

haicheng added inline comments. May 26 2016, 7:25 AM

lib/CodeGen/BranchFolding.cpp
1258	That makes sense. Do you want to add a comment to explain this? Do you also want to create a test case for this? I think you don't have any test case yet to show what you want to fix.

AFAICT, all of the test updates are just side-effects of this patch. Therefore, I agree with Haicheng in that we need a test case that exposes the issue this patch is trying to address.

Added a test case, and another check where we weren't being explicit about cfg vs layout successor.

There's now a test for the codepath that handles empty blocks with no successors. I tested, and that seems to be the driver behind all the other test changes as well.

The two other code changes at 969 and 1270 were from looking over the file to see if there were other places where fallthrough was being assumed instead of actually tested.
I can split those out into a separate patch if you would prefer.

iteratee added a child revision: D18226: Codegen: Tail-duplicate during placement. May 27 2016, 2:25 PM

iteratee added a subscriber: kbarton. May 31 2016, 11:35 AM

Hi Kyle,

Do you have test cases for your changes at lines 969 and 1270?

Haicheng

lib/CodeGen/BranchFolding.cpp
969	My understanding is that PredBB is just the layout predecessor of MBB. The CFG predecessors of MBB are stored in MergePotentials and they are compared with PredBB to see if any of them is also the layout predecessor. I know the debug info is misleading.... Do you have a test case that the current code miscompiles?
1270	Is the check you added already covered by the rest of the existing conditions? Do you have a test case for this?

Removed check after realizing it was redundant.

iteratee added inline comments. Jun 3 2016, 1:56 PM

lib/CodeGen/BranchFolding.cpp
999	No, there is no test case that changes, and nothing in the test suite changes hashes. However, the comments documenting the function disagree with you. See the comments on lines 796-797. The fact that nothing bad has happened so far isn't a good argument for not fixing the code to match the assumptions given in the comments.
1270	You're right it's covered. Looking at AnalyzeBranch, Cond.empty && !TBB implies fallthrough. I'll remove it.
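
For readers unfamiliar with the convention referred to in the comment above, the following is a rough sketch of how the outputs of an AnalyzeBranch-style query are usually read for an analyzable block. The struct, enum, and function names are hypothetical and only model the convention; this is not LLVM's API, and it assumes the query already reported the block as analyzable.

```cpp
// Hypothetical summary of an AnalyzeBranch-style result for an analyzable
// block. Field names are illustrative, not LLVM's.
struct BranchSummary {
  bool HasTBB;  // a branch target was found
  bool HasFBB;  // a second (false) target was found
  bool HasCond; // a branch condition was recorded
};

enum class BranchShape {
  FallThrough,            // no branch at the end of the block
  Unconditional,          // single unconditional branch to TBB
  ConditionalFallThrough, // conditional branch to TBB, else fall through
  ConditionalTwoWay,      // conditional branch to TBB, then branch to FBB
  Unexpected              // combination the convention does not describe
};

// Map the three outputs to the shape of the block's terminators.
BranchShape classify(const BranchSummary &B) {
  if (!B.HasTBB && !B.HasFBB && !B.HasCond)
    return BranchShape::FallThrough;
  if (B.HasTBB && !B.HasFBB && !B.HasCond)
    return BranchShape::Unconditional;
  if (B.HasTBB && !B.HasFBB && B.HasCond)
    return BranchShape::ConditionalFallThrough;
  if (B.HasTBB && B.HasFBB && B.HasCond)
    return BranchShape::ConditionalTwoWay;
  return BranchShape::Unexpected;
}
```

Under this reading, an empty condition together with a null TBB corresponds to the first case, which is why the extra check at 1270 was judged redundant.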

iteratee added inline comments. Jun 6 2016, 2:33 PM

lib/CodeGen/BranchFolding.cpp
999	I need to get this in, so if you are set against this check, I can pull it out of this patch and worry about it later.

Sorry for the late response.

Have you updated the broken test cases? I ran your change on my machine and test-sharedidx.ll, always-ext.ll, and thumb2-ifcvt3.ll passed.

lib/CodeGen/BranchFolding.cpp
999	The comment at lines 796-797 emphasizes "if any". So, maybe we can worry about it later.
test/CodeGen/X86/tail-merge-unreachable.ll
16–19	Not directly related to your change. Do you think tail merging should merge sw.bb and sw.bb2? I know tail merging currently cannot merge empty blocks or blocks that contain only an unconditional branch. I may work on that next.

Revert changes in tests that no longer require them.

I updated the test cases as requested.

lib/CodeGen/BranchFolding.cpp
999	That to me implies that it will be null if there is no such predecessor. I need to know if you want me to remove it from this patch.

I checked the modifications to the broken test cases.

I think you still break v8_IT_5.ll and licm-dominance.ll. What is your plan about these two?

Your modifications to missinglabel.ll and eh.ll are reasonable.

Your patch causes unnecessary branches in thumb2-cbnz.ll and br-fold.ll, but I think it is not your fault. I will try to create a patch to merge unreachable only blocks so that your change would not break these two.

lib/CodeGen/BranchFolding.cpp
999	Please remove it from this patch. I think it is not closely related to the rest of the change. I understand your concern and do not have strong opinion about it. If you insist, it might be better to do it in your next patch.

Split out only the portion with observable behavior change.

Remove two tests where the optimizer is completely valid to remove the assumptions.

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

In D20379#457684, @iteratee wrote:

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

In general, we don't want to reduce our test coverage. Please provide additional details as to why you believe these two tests aren't valid and why it's reasonable to delete them.

This revision now requires changes to proceed. Jun 15 2016, 8:12 AM

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

In D20379#458728, @mcrosier wrote:

In D20379#457684, @iteratee wrote:

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

In general, we don't want to reduce our test coverage. Please provide additional details as to why you believe these two tests aren't valid and why it's reasonable to delete them.

In general, we want coverage by correct tests. Incorrect tests are a liability, not an asset.

OK, let's take them one at a time:
licm-dominance:
This test has NEVER been correct. I've checked the entire version history. The goal of the test is to verify that LICM checks the dominance relation to make sure a load is guaranteed to be executed. The problem with the test is that with all the undefineds, the compiler is free to make sure the load is guaranteed to be executed. The test also relies on the particular behavior of undefined as setting eax to 0

Thumb2/v8_IT_5.ll:
There are 4 other v8_it tests, but this one suffers from a similar problem as the one above. It has too many undefineds for the assumptions about the resulting code to ever be correct.

I feel I've given plenty of time for people to respond to the tests in question. On May 19, almost a month ago I said:
"""
That only leaves licm-dominance and Thumb2/v8_IT_5.ll

The optimizer seems to be doing reasonable things in both of those cases.

I don't really want to remove tests from the regression suite, but the tests seem to be relying on bad assumptions, and I'm not the best person to fix them.
I could disable the FileCheck lines and open bugs for them.
"""
and a week later on the 25th I said:
"""
Barring comment from anyone else, I'm going to go ahead and remove

LLVM :: CodeGen/Thumb2/v8_IT_5.ll
LLVM :: CodeGen/X86/licm-dominance.ll
In licm-dominance it's clear that bugpoint has gone too far and the optimizer is being perfectly reasonable,
and it looks pretty reasonable for v8_IT_5.ll as well.
"""
A comment from you then would have been very helpful.

In D20379#458907, @iteratee wrote:

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

In D20379#458728, @mcrosier wrote:

In D20379#457684, @iteratee wrote:

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

In general, we don't want to reduce our test coverage. Please provide additional details as to why you believe these two tests aren't valid and why it's reasonable to delete them.

In general, we want coverage by correct tests. Incorrect tests are a liability, not an asset.

I agree. Please understand I just want to make sure we're doing our due diligence here. I'm not trying to impede your progress.

OK, let's take them one at a time:
licm-dominance:
This test has NEVER been correct. I've checked the entire version history. The goal of the test is to verify that LICM checks the dominance relation to make sure a load is guaranteed to be executed. The problem with the test is that with all the undefineds, the compiler is free to make sure the load is guaranteed to be executed. The test also relies on the particular behavior of undefined as setting eax to 0

After further investigation I tend to agree that bugpoint was overly aggressive and the reduced test case doesn't appear to be actually testing anything.

Thumb2/v8_IT_5.ll:
There are 4 other v8_it tests, but this one suffers from a similar problem as the one above. It has too many undefineds for the assumptions about the resulting code to ever be correct.

I think v8_IT_5.ll could have been better written, but I believe it is testing what it is intended to test. Specifically, that we can predicate a tCMPi8 instruction.

I think the critical checks are:

; CHECK: it ne
; CHECK-NEXT: cmpne
; CHECK-NEXT: bne [[JUMPTARGET:.LBB[0-9]+_[0-9]+]]

If you edit isV8EligibleForIT() in ARMFeatures.h to return false for ARM::tCMPi8 you'll break this test, which is the regression we're trying to avoid. We'll generate code like the following:

cmp     r0, #3
beq     .LBB0_3

cmp     r0, #1
beq     .LBB0_3

...

IMO, we should figure out how to fix the test so it continues to test this behavior while passing with your patch.

I've reduced the test a little without changing the CHECKs. Let me know if this works with your patch.

; RUN: llc < %s -mtriple=thumbv8 -arm-atomic-cfg-tidy=0 | FileCheck %s
; RUN: llc < %s -mtriple=thumbv7 -arm-atomic-cfg-tidy=0 -arm-restrict-it | FileCheck %s
; CHECK: it	ne
; CHECK-NEXT: cmpne
; CHECK-NEXT: bne [[JUMPTARGET:.LBB[0-9]+_[0-9]+]]
; CHECK: cbz
; CHECK-NEXT: %if.else163
; CHECK-NEXT: mov.w
; CHECK-NEXT: b
; CHECK: [[JUMPTARGET]]:{{.*}}%if.else173
; CHECK-NEXT: mov.w
; CHECK-NEXT: bx lr
; CHECK-NEXT: %if.else145
; CHECK-NEXT: mov.w

%struct.hc = type { i32, i32, i32, i32 }

define i32 @t(i32 %type) optsize {
entry:
  switch i32 %type, label %if.else173 [
    i32 3, label %if.then115
    i32 1, label %if.then102
  ]

if.then102:
  unreachable

if.then115:
  br i1 undef, label %if.else163, label %if.else145

if.else145:
  %call150 = call fastcc %struct.hc* @foo(%struct.hc* undef, i32 34865152) optsize
  br label %while.body172

if.else163:
  %call168 = call fastcc %struct.hc* @foo(%struct.hc* undef, i32 34078720) optsize
  br label %while.body172

while.body172:
  br label %while.body172

if.else173:
  ret i32 -1
}

declare hidden fastcc %struct.hc* @foo(%struct.hc* nocapture, i32) nounwind optsize

Chad

I have a fixed version of v8_IT_5.ll
Your short version made it obvious that they're relying on then102 and
then115 becoming the same destination.

Maybe you can help me with licm-dominance. I can't seem to write a load in
IR that ends up being considered as invariant to save my life.
https://xkcd.com/1168/

I can tag the loads as invariant, and then the flags are correct, but for
the load to be *actually* invariant it also has to be "dereferenceable".
I'm not certain what that means or how to write it into the IR. Pointers
would be useful.

Err, are you talking about https://github.com/llvm-mirror/llvm/blob/961fcb527d3c49bfcf9d6ff212cca3dc15682dbe/lib/CodeGen/MachineLICM.cpp#L869 ? You can't write a constant pool load in IR. You can write something like store <4 x i32> <i32 1, i32 2, i32 3, i32 4>, <4 x i32*> @g, and SelectionDAG will generate a constant pool load.

I figured this out with help from IRC.

iteratee added a parent revision: D21448: Codegen: LICM Remove check for exactly 1 register def. Jun 16 2016, 1:48 PM

Replaced the two deleted tests, now better than ever.

mcrosier, thanks for your help with the v8 test. When you shrank it, it was obvious what was intended, and what needed to change.
As an aside, we can't use 1 and 3 any more, as the compiler gets clever with constants, ORing with 2 and then checking against 3. Hence 6 and 13.
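
The constant trick mentioned above can be checked with a small sketch. The function names are made up for illustration, and the explanation of why 6 and 13 avoid the folding (they differ in more than one bit) is an inference rather than something stated in the review.

```cpp
#include <cstdint>

// Two separate comparisons, as the switch on 1 and 3 is written in the IR.
constexpr bool matchesNaive(std::uint32_t X) { return X == 1u || X == 3u; }

// Single comparison: setting bit 1 maps both 1 and 3 to 3, and nothing else.
constexpr bool matchesFolded(std::uint32_t X) { return (X | 2u) == 3u; }

// True when A and B differ in exactly one bit position, which is what makes
// the OR-then-compare form possible for a pair of constants.
constexpr bool differInExactlyOneBit(std::uint32_t A, std::uint32_t B) {
  return (A ^ B) != 0u && ((A ^ B) & ((A ^ B) - 1u)) == 0u;
}

static_assert(differInExactlyOneBit(1u, 3u), "1 and 3 differ only in bit 1");
static_assert(!differInExactlyOneBit(6u, 13u), "6 and 13 differ in several bits");
```

Because 6 and 13 differ in three bit positions, no single OR mask collapses them into one comparison, so the two compares the test checks for are presumably kept.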

OK, no removed tests, and no more dependent revisions.
Any other concerns?

It looks good to me now. Thank you for working on this, Kyle. I will create a patch to clean up the extra branches later.

mcrosier accepted this revision. Jun 24 2016, 9:43 AM

mcrosier edited edge metadata.

This revision is now accepted and ready to land. Jun 24 2016, 9:43 AM

iteratee closed this revision. Jun 27 2016, 12:46 PM

iteratee removed a child revision: D18226: Codegen: Tail-duplicate during placement. Jun 28 2016, 4:07 PM

iteratee removed a child revision: D20505: Codegen: Make chains from lattice-shaped CFGs. Nov 1 2016, 4:39 PM

Revision Contents

Path	Size
lib/CodeGen/BranchFolding.cpp	2 lines
test/CodeGen/Mips/eh.ll	2 lines
test/CodeGen/SPARC/missinglabel.ll	4 lines
test/CodeGen/Thumb2/thumb2-cbnz.ll	4 lines
test/CodeGen/X86/br-fold.ll	2 lines
test/CodeGen/X86/tail-merge-unreachable.ll	34 lines

Diff 60560

lib/CodeGen/BranchFolding.cpp

... 960 lines hidden (context: if (!AfterBlockPlacement) {) ...
    }

    // If this is a large problem, avoid visiting the same basic blocks
    // multiple times.
    if (MergePotentials.size() == TailMergeThreshold)
      for (unsigned i = 0, e = MergePotentials.size(); i != e; ++i)
        TriedMerging.insert(MergePotentials[i].getBlock());

    // See if we can do any tail merging on those.
    if (MergePotentials.size() >= 2)
      MadeChange |= TryTailMergeBlocks(nullptr, nullptr);
  }

  // Look at blocks (IBB) with multiple predecessors (PBB).
  // We change each predecessor to a canonical form, by
  // (1) temporarily removing any unconditional branch from the predecessor
  // to IBB, and
... 13 lines hidden (context: bool BranchFolder::TailMergeBlocks(MachineFunction &MF) {) ...
  // transformations.)

  for (MachineFunction::iterator I = std::next(MF.begin()), E = MF.end();
       I != E; ++I) {
    if (I->pred_size() < 2) continue;
    SmallPtrSet<MachineBasicBlock *, 8> UniquePreds;
    MachineBasicBlock *IBB = &*I;
    MachineBasicBlock *PredBB = &*std::prev(I);
    MergePotentials.clear();
    for (MachineBasicBlock *PBB : I->predecessors()) {
      if (MergePotentials.size() == TailMergeThreshold)
        break;

      if (TriedMerging.count(PBB))
        continue;

      // Skip blocks that loop to themselves, can't tail merge these.
... 242 lines hidden (context: if (IsEmptyBlock(MBB) && !MBB->isEHPad() && !MBB->hasAddressTaken() &&) ...

    if (FallThrough == MF.end()) {
      // TODO: Simplify preds to not branch here if possible!
    } else if (FallThrough->isEHPad()) {
      // Don't rewrite to a landing pad fallthough. That could lead to the case
      // where a BB jumps to more than one landing pad.
      // TODO: Is it ever worth rewriting predecessors which don't already
      // jump to a landing pad, and so can safely jump to the fallthrough?
-    } else {
+    } else if (MBB->isSuccessor(&*FallThrough)) {
      // Rewrite all predecessors of the old block to go to the fallthrough
      // instead.
      while (!MBB->pred_empty()) {
        MachineBasicBlock *Pred = *(MBB->pred_end()-1);
        Pred->ReplaceUsesOfBlockWith(MBB, &*FallThrough);
      }
      // If MBB was the target of a jump table, update jump tables to go to the
      // fallthrough instead.
      if (MachineJumpTableInfo *MJTI = MF.getJumpTableInfo())
        MJTI->ReplaceMBBInJumpTables(MBB, &*FallThrough);
      MadeChange = true;
    }
    return MadeChange;
  }

  // Check to see if we can simplify the terminator of the block before this
  // one.
  MachineBasicBlock &PrevBB = *std::prev(MachineFunction::iterator(MBB));

  MachineBasicBlock *PriorTBB = nullptr, *PriorFBB = nullptr;
... 94 lines hidden (context: if (!PriorUnAnalyzable) {) ...
    // a call to a no-return function like abort or __cxa_throw) and if the pred
    // falls through into this block, and if it would otherwise fall through
    // into the block after this, move this block to the end of the function.
    //
    // We consider it more likely that execution will stay in the function (e.g.
    // due to loops) than it is to exit it. This asserts in loops etc, moving
    // the assert condition out of the loop body.
    if (MBB->succ_empty() && !PriorCond.empty() && !PriorFBB &&
        MachineFunction::iterator(PriorTBB) == FallThrough &&
        !MBB->canFallThrough()) {
      bool DoTransform = true;

      // We have to be careful that the succs of PredBB aren't both no-successor
      // blocks. If neither have successors and if PredBB is the second from
      // last block in the function, we'd just keep swapping the two blocks for
      // last. Only do the swap if one is clearly better to fall through than
      // the other.
... 205 lines hidden (context: if (!CurFallsThru) {) ...
      // removed, move this block to the end of the function.
      MachineBasicBlock *PrevTBB = nullptr, *PrevFBB = nullptr;
      SmallVector<MachineOperand, 4> PrevCond;
      // We're looking for cases where PrevBB could possibly fall through to
      // FallThrough, but if FallThrough is an EH pad that wouldn't be useful
      // so here we skip over any EH pads so we might have a chance to find
      // a branch target from PrevBB.
      while (FallThrough != MF.end() && FallThrough->isEHPad())
        ++FallThrough;
// Now check to see if the current block is sitting between PrevBB and		// Now check to see if the current block is sitting between PrevBB and
// a block to which it could fall through.		// a block to which it could fall through.
if (FallThrough != MF.end() &&		if (FallThrough != MF.end() &&
!TII->AnalyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&		!TII->AnalyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&
PrevBB.isSuccessor(&*FallThrough)) {		PrevBB.isSuccessor(&*FallThrough)) {
MBB->moveAfter(&MF.back());		MBB->moveAfter(&MF.back());
MadeChange = true;		MadeChange = true;
return MadeChange;		return MadeChange;
[308 lines not shown]
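
The fall-through test in this hunk can be illustrated with a small standalone model. This is a hedged sketch, not LLVM code: `Block`, `isSuccessor`, and `wouldFallThroughIfRemoved` are hypothetical stand-ins for `MachineBasicBlock`, `MachineBasicBlock::isSuccessor`, and the condition guarding `MBB->moveAfter(&MF.back())`, and the `AnalyzeBranch` check is left out. It shows the candidate block being found by skipping EH pads in layout order and then confirmed as a real CFG successor, rather than assumed to be one because it comes next in the layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MachineBasicBlock: a block is identified by its
// index in the layout vector.
struct Block {
  bool isEHPad = false;
  std::vector<std::size_t> successors; // CFG successors, by layout index
};

// True if `to` is a CFG successor of `from` (models isSuccessor).
bool isSuccessor(const std::vector<Block> &layout, std::size_t from,
                 std::size_t to) {
  const auto &succs = layout[from].successors;
  return std::find(succs.begin(), succs.end(), to) != succs.end();
}

// Would the block laid out before `cur` reach a CFG successor by falling
// through if `cur` were moved away? (Assumption: the branch-analysis step of
// the real code is omitted here.)
bool wouldFallThroughIfRemoved(const std::vector<Block> &layout,
                               std::size_t cur) {
  assert(cur > 0 && cur < layout.size() && "cur needs a layout predecessor");
  std::size_t fallThrough = cur + 1;
  // Skip EH pads: falling through into one would not be useful.
  while (fallThrough != layout.size() && layout[fallThrough].isEHPad)
    ++fallThrough;
  // Layout adjacency alone is not enough; require an actual CFG edge.
  return fallThrough != layout.size() &&
         isSuccessor(layout, cur - 1, fallThrough);
}
```

In this model, a predecessor whose successor list does not contain the next non-EH-pad block yields `false` even though that block follows it in the layout, which is the distinction the patch summary draws between layout successors and CFG successors.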

test/CodeGen/Mips/eh.ll

[18 lines not shown]
; CHECK-EL: .cfi_offset 31, -12
%exception = tail call i8* @__cxa_allocate_exception(i32 8) nounwind		%exception = tail call i8* @__cxa_allocate_exception(i32 8) nounwind
%0 = bitcast i8* %exception to double*		%0 = bitcast i8* %exception to double*
store double 3.200000e+00, double* %0, align 8		store double 3.200000e+00, double* %0, align 8
invoke void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTId to i8*), i8* null) noreturn		invoke void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTId to i8*), i8* null) noreturn
to label %unreachable unwind label %lpad		to label %unreachable unwind label %lpad

lpad: ; preds = %entry		lpad: ; preds = %entry
; CHECK-EL: # %lpad		; CHECK-EL: # %lpad
; CHECK-EL: beq $5		; CHECK-EL: bne $5

%exn.val = landingpad { i8*, i32 }		%exn.val = landingpad { i8*, i32 }
cleanup		cleanup
catch i8* bitcast (i8** @_ZTId to i8*)		catch i8* bitcast (i8** @_ZTId to i8*)
%exn = extractvalue { i8*, i32 } %exn.val, 0		%exn = extractvalue { i8*, i32 } %exn.val, 0
%sel = extractvalue { i8*, i32 } %exn.val, 1		%sel = extractvalue { i8*, i32 } %exn.val, 1
%1 = tail call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTId to i8*)) nounwind		%1 = tail call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTId to i8*)) nounwind
%2 = icmp eq i32 %sel, %1		%2 = icmp eq i32 %sel, %1
[29 lines not shown]

test/CodeGen/SPARC/missinglabel.ll

	; RUN: llc < %s -verify-machineinstrs \| FileCheck %s			; RUN: llc < %s -verify-machineinstrs \| FileCheck %s
	target datalayout = "E-m:e-i64:64-n32:64-S128"			target datalayout = "E-m:e-i64:64-n32:64-S128"
	target triple = "sparc64-unknown-linux-gnu"			target triple = "sparc64-unknown-linux-gnu"

	define void @f() align 2 {			define void @f() align 2 {
	entry:			entry:
	; CHECK: %xcc, .LBB0_1			; CHECK: %xcc, .LBB0_2
	%cmp = icmp eq i64 undef, 0			%cmp = icmp eq i64 undef, 0
	br i1 %cmp, label %targetblock, label %cond.false			br i1 %cmp, label %targetblock, label %cond.false

	cond.false:			cond.false:
	unreachable			unreachable

	; CHECK: .LBB0_1: ! %targetblock			; CHECK: .LBB0_2: ! %targetblock
	targetblock:			targetblock:
	br i1 undef, label %cond.false.i83, label %exit.i85			br i1 undef, label %cond.false.i83, label %exit.i85

	cond.false.i83:			cond.false.i83:
	unreachable			unreachable

	exit.i85:			exit.i85:
	unreachable			unreachable
	}			}

test/CodeGen/Thumb2/thumb2-cbnz.ll

	; RUN: llc < %s -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -arm-atomic-cfg-tidy=0 \| FileCheck %s			; RUN: llc < %s -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -arm-atomic-cfg-tidy=0 \| FileCheck %s
	; rdar://7354379			; rdar://7354379

	declare double @foo(double) nounwind readnone			declare double @foo(double) nounwind readnone

	define void @t(i32 %c, double %b) {			define void @t(i32 %c, double %b) {
	entry:			entry:
				; CHECK: cmp r0, #0
	%cmp1 = icmp ne i32 %c, 0			%cmp1 = icmp ne i32 %c, 0
	br i1 %cmp1, label %bb3, label %bb1			br i1 %cmp1, label %bb3, label %bb1

	bb1: ; preds = %entry			bb1: ; preds = %entry
	unreachable			unreachable

	bb3: ; preds = %entry			bb3: ; preds = %entry
	%cmp2 = icmp ne i32 %c, 0			%cmp2 = icmp ne i32 %c, 0
	br i1 %cmp2, label %bb7, label %bb5			br i1 %cmp2, label %bb7, label %bb5

	bb5: ; preds = %bb3			bb5: ; preds = %bb3
	unreachable			unreachable

	bb7: ; preds = %bb3			bb7: ; preds = %bb3
	%cmp3 = icmp ne i32 %c, 0			%cmp3 = icmp ne i32 %c, 0
	br i1 %cmp3, label %bb11, label %bb9			br i1 %cmp3, label %bb11, label %bb9

	bb9: ; preds = %bb7			bb9: ; preds = %bb7
	; CHECK: cmp r0, #0			; CHECK: cbnz
	; CHECK-NEXT: cbnz
	%0 = tail call double @foo(double %b) nounwind readnone ; <double> [#uses=0]			%0 = tail call double @foo(double %b) nounwind readnone ; <double> [#uses=0]
	br label %bb11			br label %bb11

	bb11: ; preds = %bb9, %bb7			bb11: ; preds = %bb9, %bb7
	%1 = getelementptr i32, i32* undef, i32 0			%1 = getelementptr i32, i32* undef, i32 0
	store i32 0, i32* %1			store i32 0, i32* %1
	ret void			ret void
	}			}

test/CodeGen/X86/br-fold.ll

	; RUN: llc -mtriple=x86_64-apple-darwin < %s \| FileCheck -check-prefix=X64_DARWIN %s			; RUN: llc -mtriple=x86_64-apple-darwin < %s \| FileCheck -check-prefix=X64_DARWIN %s
	; RUN: llc -mtriple=x86_64-pc-linux < %s \| FileCheck -check-prefix=X64_LINUX %s			; RUN: llc -mtriple=x86_64-pc-linux < %s \| FileCheck -check-prefix=X64_LINUX %s
	; RUN: llc -mtriple=x86_64-pc-windows < %s \| FileCheck -check-prefix=X64_WINDOWS %s			; RUN: llc -mtriple=x86_64-pc-windows < %s \| FileCheck -check-prefix=X64_WINDOWS %s
	; RUN: llc -mtriple=x86_64-pc-windows-gnu < %s \| FileCheck -check-prefix=X64_WINDOWS_GNU %s			; RUN: llc -mtriple=x86_64-pc-windows-gnu < %s \| FileCheck -check-prefix=X64_WINDOWS_GNU %s
	; RUN: llc -mtriple=x86_64-scei-ps4 < %s \| FileCheck -check-prefix=PS4 %s			; RUN: llc -mtriple=x86_64-scei-ps4 < %s \| FileCheck -check-prefix=PS4 %s

	; X64_DARWIN: orq			; X64_DARWIN: orq
				; X64_DARWIN-NEXT: jne
	; X64_DARWIN-NEXT: %bb8.i329			; X64_DARWIN-NEXT: %bb8.i329

	; X64_LINUX: orq %rax, %rcx			; X64_LINUX: orq %rax, %rcx
				; X64_LINUX-NEXT: jne
	; X64_LINUX-NEXT: %bb8.i329			; X64_LINUX-NEXT: %bb8.i329

	; X64_WINDOWS: orq %rax, %rcx			; X64_WINDOWS: orq %rax, %rcx
	; X64_WINDOWS-NEXT: ud2			; X64_WINDOWS-NEXT: ud2

	; X64_WINDOWS_GNU: orq %rax, %rcx			; X64_WINDOWS_GNU: orq %rax, %rcx
	; X64_WINDOWS_GNU-NEXT: ud2			; X64_WINDOWS_GNU-NEXT: ud2

	[18 lines not shown]

test/CodeGen/X86/tail-merge-unreachable.ll

This file was added.

				; RUN: llc -mtriple=x86_64-linux-gnu %s -o - -verify-machineinstrs \| FileCheck %s

				define i32 @tail_merge_unreachable(i32 %i) {
				entry:
				br i1 undef, label %sw, label %end
				sw:
				switch i32 %i, label %end [
				i32 99, label %sw.bb
				i32 98, label %sw.bb
				i32 101, label %sw.bb
				i32 97, label %sw.bb2
				i32 96, label %sw.bb2
				i32 100, label %sw.bb2
				]
				sw.bb:
				unreachable
				sw.bb2:
				unreachable
				end:
				haicheng (inline comment, marked Not Done): Not directly related to your change. Do you think tail merging should merge sw.bb and sw.bb2? I know tail merging currently cannot merge empty blocks or blocks that contain only an unconditional branch. I may work on that next.
				%p = phi i32 [ 1, %sw ], [ 0, %entry ]
				ret i32 %p

				; CHECK-LABEL: tail_merge_unreachable:
				; Range Check
				; CHECK: addl $-96
				; CHECK: cmpl $5
				; CHECK: jbe [[JUMP_TABLE_BLOCK:[.][A-Za-z0-9_]+]]
				; CHECK: retq
				; CHECK: [[JUMP_TABLE_BLOCK]]:
				; CHECK: btl
				; CHECK: jae [[UNREACHABLE_BLOCK:[.][A-Za-z0-9_]+]]
				; CHECK: [[UNREACHABLE_BLOCK]]:
				; CHECK: .Lfunc_end0
				}
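
The `; Range Check` lines in this test match a common lowering for a dense switch: the case values 96 through 101 are covered by one unsigned comparison after subtracting the lowest case, which is what `addl $-96` followed by `cmpl $5` and `jbe` expresses. The sketch below shows that arithmetic in plain C++, plus a bit test of the kind `btl` performs. The function names and the particular mask are illustrative assumptions; the mask that llc actually emits is not shown in this review.

```cpp
#include <cassert>
#include <cstdint>

// True when i is one of the switch's case values (96..101).
// Subtracting 96 with unsigned wraparound maps values below 96 to very
// large numbers, so a single unsigned compare checks both bounds.
bool inCaseRange(std::int32_t i) {
  std::uint32_t offset = static_cast<std::uint32_t>(i) - 96u;
  return offset <= 5u;
}

// Cases 98, 99 and 101 branch to sw.bb: offsets 2, 3 and 5 from 96.
// (Assumed encoding, chosen for illustration only.)
constexpr std::uint32_t kSwBbMask = (1u << 2) | (1u << 3) | (1u << 5);

// Selects between the two case groups with a single bit test.
// Precondition: inCaseRange(i).
bool goesToSwBb(std::int32_t i) {
  assert(inCaseRange(i) && "value must be inside the case range");
  std::uint32_t offset = static_cast<std::uint32_t>(i) - 96u;
  return ((kSwBbMask >> offset) & 1u) != 0;
}
```

Values outside 96..101 fail `inCaseRange` and correspond to the switch's default edge to `end`; values inside it are split between `sw.bb` and `sw.bb2` by the bit test.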

Revision Contents

Diff 60560