This is an archive of the discontinued LLVM Phabricator instance.

Codegen: Fix broken assumption in Tail Merge.
ClosedPublic

Authored by iteratee on May 18 2016, 11:41 AM.

Details

Reviewers
reames
mcrosier
Summary

Tail merge was making the assumption that a layout successor or
predecessor was always a cfg successor/predecessor. Remove that
assumption. Changes to tests are necessary because the errant cfg edges
were preventing optimizations.

Diff Detail

Event Timeline

iteratee updated this revision to Diff 57652.May 18 2016, 11:41 AM
iteratee retitled this revision from to Codegen: Fix broken assumption in Tail Merge..
iteratee updated this object.
iteratee added reviewers: mcrosier, reames.
iteratee set the repository for this revision to rL LLVM.
iteratee added subscribers: llvm-commits, sunfish.

The following 4 tests don't have an obvious fix with this test. I was hoping to get some help by the test authors.

LLVM :: CodeGen/Thumb2/v8_IT_5.ll
LLVM :: CodeGen/WebAssembly/cfg-stackify.ll
LLVM :: CodeGen/X86/licm-dominance.ll
LLVM :: CodeGen/X86/wineh-coreclr.ll

licm-dominance is a real pain because the undefineds get loop-optimized away and invalidate the test assumptions. It looks like it was never correct.

Feel free to disable CodeGen/WebAssembly/cfg-stackify.ll, such as by just removing the "| FileCheck %s" parts of the RUN lines, for now. I'm happy to fix the test and re-enable it myself later.

(This is a particularly hairy test. I'm hoping that MIR serialization will allow us to rewrite it in a less fragile way. But for now, it seems best to just disable it so that it doesn't block other work.)

iteratee updated this revision to Diff 57664.May 18 2016, 12:49 PM
iteratee removed rL LLVM as the repository for this revision.

Disabled cfg-stackify as requested.

Feel free to disable CodeGen/WebAssembly/cfg-stackify.ll, such as by just removing the "| FileCheck %s" parts of the RUN lines, for now. I'm happy to fix the test and re-enable it myself later.

(This is a particularly hairy test. I'm hoping that MIR serialization will allow us to rewrite it in a less fragile way. But for now, it seems best to just disable it so that it doesn't block other work.)

Done. Thanks.

iteratee set the repository for this revision to rL LLVM.May 18 2016, 12:50 PM

For CodeGen/X86/licm-dominance.ll I was specifically hoping that the test could be re-reduced with this change in place. I tried to create a test that would exercise the code in question, but I don't seem to understand licm well enough.

I just noticed that test/CodeGen/WebAssembly/mem-intrinsics.ll seems to be failing as well with this change.
If you can look at that as well, it would be awesome.

test/CodeGen/WebAssembly/mem-intrinsics.ll passes for me, with this patch applied. Let me know if there's anything I can help with.

Hi Kyle,

I think FallThrough does not have to be the CFG successor of MBB. It might be easier to understand it as the potential fallthrough block of either MBB or PrevBB.

Haicheng

lib/CodeGen/BranchFolding.cpp
1381

If you change FallThrough to MF.end(), I think you may miss this optimization.

This piece of code tries to optimize this case

PrevBB---MBB  FallThrough
      |____________|

Where MBB is the return block, PrevBB is both the layout predecessor and CFG predecessor of MBB, Fallthourgh is just a layout successor of MBB and must *not* be the CFG successor of MBB. In this case, MBB can be moved to the bottom of the MF and PrevBB can fallthrough to FallThrough.

1603

Similar situation here. Fallthrough is used to iterate all blocks below MBB in the layout to find the first non-EHPad. I think changing it to MF.end() may miss this one too.

iteratee updated this revision to Diff 57823.May 19 2016, 11:00 AM
iteratee removed rL LLVM as the repository for this revision.
iteratee marked 2 inline comments as done.

Narrowed the check for FallThrough being an actual successor to the one place where this was assumed.

Thanks Haicheng.

lib/CodeGen/BranchFolding.cpp
1381

I moved the test to the empty block removal, which does assume that FallThrough is a CFG successor.

That only leaves licm-dominance and Thumb2/v8_IT_5.ll

The optimizer seems to be doing reasonable things in both of those cases.

I don't really want to remove tests from the regression suite, but the tests seem to be relying on bad assumptions, and I'm not the best person to fix them.
I could disable the FileCheck lines and open bugs for them.

test/CodeGen/WebAssembly/mem-intrinsics.ll passes for me, with this patch applied. Let me know if there's anything I can help with.

Thanks for looking at that. It was another change, sorry for the noise.

Please see the inlined comment.

lib/CodeGen/BranchFolding.cpp
1258

I think we can enter here only if MBB is empty which means MBB does not have any branch. In this case, I think MBB has nowhere to go but fallthrough to FallThrough and this check is not necessary.

Barring comment from anyone else, I'm going to go ahead and remove

LLVM :: CodeGen/Thumb2/v8_IT_5.ll
LLVM :: CodeGen/X86/licm-dominance.ll

In licm-dominance it's clear that bugpoint has gone too far and the optimizer is being perfectly reasonable,
and it looks pretty reasonable for v8_IT_5.ll as well.

iteratee marked an inline comment as done.May 25 2016, 5:40 PM
iteratee added inline comments.
lib/CodeGen/BranchFolding.cpp
1258

This is actually what caused me to look into fallthrough in this file. An empty unreachable block has no successors, and replacing it with its layout successor can create invalid cfg edges.

haicheng added inline comments.May 26 2016, 7:25 AM
lib/CodeGen/BranchFolding.cpp
1258

That makes sense. Do you want to add a comment to explain this? Do you also want to create a test case for this? I think you don't have any test case yet to show what you want to fix.

mcrosier edited edge metadata.May 26 2016, 8:37 AM

AFAICT, all of the test updates are just side-effects of this patch. Therefore, I agree with Haicheng in that we need a test case that exposes the issue this patch is trying to address.

iteratee updated this revision to Diff 58663.May 26 2016, 12:06 PM
iteratee edited edge metadata.
iteratee marked an inline comment as done.

Added a test case, and another check where we weren't being explicit about cfg vs layout successor.

There's now a test for the codepath that handles empty blocks with no successors. I tested, and that seems to be the driver behind all the other test changes as well.

The two other code changes at 969 and 1270 were from looking over the file to see if there were other places where fallthrough was being assumed instead of actually tested.
I can split those out into a separate patch if you would prefer.

Hi Kyle,

Do you have test cases for your change in line 969 and 1270?

Haicheng

lib/CodeGen/BranchFolding.cpp
969

My understanding is that PredBB is just the layout predecessor of MBB. The CFG predecessors of MBB are stored in MergePotentials and they are compared with PredBB to see if any of them is also the layout predecessor. I know the debug info is misleading....

Do you have a test case that the current code miscompiles?

1270

Is this check you added covered by the rest existing conditions? Do you have a test case for this?

iteratee updated this revision to Diff 59613.Jun 3 2016, 1:50 PM

Removed check after realizing it was redundant.

iteratee added inline comments.Jun 3 2016, 1:56 PM
lib/CodeGen/BranchFolding.cpp
999

No, there is no test case that changes, and nothing in the test suite changes hashes.

However, the comments documenting the function disagree with you. See the comments on lines 796-797. The fact that nothing bad has happened so far isn't a good argument for not fixing the code to match the assumptions given in the comments.

1270

You're right it's covered. Looking at AnalyzeBranch, Cond.empty && !TBB implies fallthrough.

I'll remove it.

iteratee added inline comments.Jun 6 2016, 2:33 PM
lib/CodeGen/BranchFolding.cpp
999

I need to get this in, so if you are set against this check, I can pull it out of this patch and worry about it later.

Sorry for the late response.

Have you updated the broken test cases? I ran your change on my machine and I could pass test-sharedidx.ll always-ext.ll thumb2-ifcvt3.ll

lib/CodeGen/BranchFolding.cpp
999

Comment 796-797 emphasizes *if any*. So, maybe we can worry about it later.

test/CodeGen/X86/tail-merge-unreachable.ll
16–19

Not directly related to your change. Do you think tail merging should merge sw.bb and sw.bb2? I know tail merging currently cannot merge empty blocks or unconditional branch only blocks. I may work on it as next.

iteratee updated this revision to Diff 60118.Jun 8 2016, 4:58 PM

Revert changes in tests that no longer require them.

I updated the test cases as requested.

lib/CodeGen/BranchFolding.cpp
999

That to me implies that it will be null if there is no such predecessor.
I need to know if you want me to remove it from this patch.

haicheng added a comment.EditedJun 10 2016, 11:48 AM

I checked the modifications to the broken test cases.

I think you still break v8_IT_5.ll and licm-dominance.ll. What is your plan about these two?

Your modifications to missinglabel.ll and eh.ll are reasonable.

Your patch causes unnecessary branches in thumb2-cbnz.ll and br-fold.ll, but I think it is not your fault. I will try to create a patch to merge unreachable only blocks so that your change would not break these two.

lib/CodeGen/BranchFolding.cpp
999

Please remove it from this patch. I think it is not closely related to the rest of the change.

I understand your concern and do not have strong opinion about it. If you insist, it might be better to do it in your next patch.

iteratee updated this revision to Diff 60560.Jun 13 2016, 10:49 AM

Split out only the portion with observable behavior change.

iteratee updated this revision to Diff 60706.Jun 14 2016, 10:40 AM

Remove two tests where the optimizer is completely valid to remove the assumptions.

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

mcrosier requested changes to this revision.Jun 15 2016, 8:12 AM
mcrosier edited edge metadata.

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

In general, we don't want to reduce our test coverage. Please provide additional details as to why you believe these two tests aren't valid and why it's reasonable to delete them.

This revision now requires changes to proceed.Jun 15 2016, 8:12 AM

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

In general, we don't want to reduce our test coverage. Please provide additional details as to why you believe these two tests aren't valid and why it's reasonable to delete them.

In general, we want coverage by correct tests. Incorrect tests are a liability, not an asset.

OK, let's take them one at a time:
licm-dominance:
This test has NEVER been correct. I've checked the entire version history. The goal of the test is to verify that LICM checks the dominance relation to make sure a load is guaranteed to be executed. The problem with the test is that with all the undefineds, the compiler is free to make sure the load is guaranteed to be executed. The test also relies on the particular behavior of undefined as setting eax to 0

Thumb2/v8_IT_5.ll:
There are 4 other v8_it tests, but this one suffers from a similar problem as the one above. It has too many undefineds for the assumptions about the resulting code to ever be correct.

I feel I've given plenty of time for people to respond to the tests in question. On May 19, almost a month ago I said:
"""
That only leaves licm-dominance and Thumb2/v8_IT_5.ll

The optimizer seems to be doing reasonable things in both of those cases.

I don't really want to remove tests from the regression suite, but the tests seem to be relying on bad assumptions, and I'm not the best person to fix them.
I could disable the FileCheck lines and open bugs for them.
"""
and a week later on the 25th I said:
"""
Barring comment from anyone else, I'm going to go ahead and remove

LLVM :: CodeGen/Thumb2/v8_IT_5.ll
LLVM :: CodeGen/X86/licm-dominance.ll
In licm-dominance it's clear that bugpoint has gone too far and the optimizer is being perfectly reasonable,
and it looks pretty reasonable for v8_IT_5.ll as well.
"""
A comment from you then would have been very helpful.

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

OK, the two failing tests are gone, I don't think they were valid to begin with. Are there any other concerns about this change?

In general, we don't want to reduce our test coverage. Please provide additional details as to why you believe these two tests aren't valid and why it's reasonable to delete them.

In general, we want coverage by correct tests. Incorrect tests are a liability, not an asset.

I agree. Please understand I just want to make sure we're doing our due diligence here. I'm not trying to impeded your progress.

OK, let's take them one at a time:
licm-dominance:
This test has NEVER been correct. I've checked the entire version history. The goal of the test is to verify that LICM checks the dominance relation to make sure a load is guaranteed to be executed. The problem with the test is that with all the undefineds, the compiler is free to make sure the load is guaranteed to be executed. The test also relies on the particular behavior of undefined as setting eax to 0

After further investigation I tend to agree that bugpoint was overly aggressive and the reduced test case doesn't appear to be actually testing anything.

Thumb2/v8_IT_5.ll:
There are 4 other v8_it tests, but this one suffers from a similar problem as the one above. It has too many undefineds for the assumptions about the resulting code to ever be correct.

I think v8_IT_5.ll could have been better written, but I believe it is testing what it is intended to test. Specifically, that we can predicate a tCMPi8 instruction.

I think the critical checks are:

; CHECK: it ne
; CHECK-NEXT: cmpne
; CHECK-NEXT: bne [[JUMPTARGET:.LBB[0-9]+_[0-9]+]]

If you edit isV8EligibleForIT() in ARMFeatures.h to return false for ARM::tCMPi8 you'll break this test, which is the regression we're trying to avoid. We'll generate code like the following:

cmp     r0, #3
beq     .LBB0_3

cmp     r0, #1
beq     .LBB0_3

...

IMO, we should figure out how to fix the test so it continues to test this behavior while passing with your patch.

I've reduced the test a little without changing the CHECKs. Let me know if this works with your patch.

; RUN: llc < %s -mtriple=thumbv8 -arm-atomic-cfg-tidy=0 | FileCheck %s
; RUN: llc < %s -mtriple=thumbv7 -arm-atomic-cfg-tidy=0 -arm-restrict-it | FileCheck %s
; CHECK: it	ne
; CHECK-NEXT: cmpne
; CHECK-NEXT: bne [[JUMPTARGET:.LBB[0-9]+_[0-9]+]]
; CHECK: cbz
; CHECK-NEXT: %if.else163
; CHECK-NEXT: mov.w
; CHECK-NEXT: b
; CHECK: [[JUMPTARGET]]:{{.*}}%if.else173
; CHECK-NEXT: mov.w
; CHECK-NEXT: bx lr
; CHECK-NEXT: %if.else145
; CHECK-NEXT: mov.w

%struct.hc = type { i32, i32, i32, i32 }

define i32 @t(i32 %type) optsize {
entry:
  switch i32 %type, label %if.else173 [
    i32 3, label %if.then115
    i32 1, label %if.then102
  ]

if.then102:
  unreachable

if.then115:
  br i1 undef, label %if.else163, label %if.else145

if.else145:
  %call150 = call fastcc %struct.hc* @foo(%struct.hc* undef, i32 34865152) optsize
  br label %while.body172

if.else163:
  %call168 = call fastcc %struct.hc* @foo(%struct.hc* undef, i32 34078720) optsize
  br label %while.body172

while.body172:
  br label %while.body172

if.else173:
  ret i32 -1
}

declare hidden fastcc %struct.hc* @foo(%struct.hc* nocapture, i32) nounwind optsize

Chad

I feel I've given plenty of time for people to respond to the tests in question. On May 19, almost a month ago I said:
"""
That only leaves licm-dominance and Thumb2/v8_IT_5.ll

The optimizer seems to be doing reasonable things in both of those cases.

I don't really want to remove tests from the regression suite, but the tests seem to be relying on bad assumptions, and I'm not the best person to fix them.
I could disable the FileCheck lines and open bugs for them.
"""
and a week later on the 25th I said:
"""
Barring comment from anyone else, I'm going to go ahead and remove

LLVM :: CodeGen/Thumb2/v8_IT_5.ll
LLVM :: CodeGen/X86/licm-dominance.ll
In licm-dominance it's clear that bugpoint has gone too far and the optimizer is being perfectly reasonable,
and it looks pretty reasonable for v8_IT_5.ll as well.
"""
A comment from you then would have been very helpful.

I have a fixed version of v8_IT_5.ll
Your short version made it obvious that they're relying on then102 and
then115 becoming the same destination.

Maybe you can help me with licm-dominance. I can't seem to write a load in
IR that ends up being considered as invariant to save my life.
https://xkcd.com/1168/

I can tag the loads as invariant, and then the flags are correct, but for
the load to be *actually* invariant it also has to be "dereferenceable".
I'm not certain what that means or how to write it into the IR. Pointers
would be useful.

Err, are you talking about https://github.com/llvm-mirror/llvm/blob/961fcb527d3c49bfcf9d6ff212cca3dc15682dbe/lib/CodeGen/MachineLICM.cpp#L869 ? You can't write a constant pool load in IR. You can write something like store <4 x i32> <i32 1, i32 2, i32 3, i32 4>, <4 x i32*> @g, and SelectionDAG will generate a constant pool load.

I figured this out with help from IRC.

iteratee updated this revision to Diff 61022.Jun 16 2016, 1:58 PM
iteratee edited edge metadata.

Replaced the two deleted tests, now better than ever.

mcrosier, thanks for your help with the v8 test. When you shrunk it, it was obvious what was intended, and what needed to change.
As an aside, we can't use 1 and 3 any more, as the compiler gets clever with constants, oring with 2 and then checking against 3. Hence 6 and 13.

OK, no removed tests, and no more dependent revisions.
Any other concerns?

It looks good to me now. Thank you for working on this, Kyle. I will create a patch to cleanup the extra branches later.

mcrosier accepted this revision.Jun 24 2016, 9:43 AM
mcrosier edited edge metadata.
This revision is now accepted and ready to land.Jun 24 2016, 9:43 AM
iteratee closed this revision.Jun 27 2016, 12:46 PM